ABSTRACT

We report the results of a systematic search for signatures of metal lines in
quasar spectra of the Sloan Digital Sky Survey (SDSS) Data Release 3 (DR3),
focusing on finding intervening absorbers via detection of their O VI doublet.
Here we present the search algorithm and the criteria for distinguishing candidates
from spurious Lyman α forest lines. In addition, we compare our findings with
simulations of the Lyman α forest in order to estimate the detectability of O VI
doublets over various redshift intervals. We have obtained a sample of 1756 O VI
doublet candidates with rest-frame equivalent width > 0.05 Å in 855 AGN spectra
(out of 3702 objects with redshifts in the accessible range for O VI detection).
This sample is further subdivided into 3 groups according to the likelihood of
being real and the potential for follow-up observation of the candidate. The group
with the cleanest and most secure candidates comprises 145 candidates.
Of these, 69 reside at a velocity separation > 5000 km/s from the QSO, and
can therefore be classified tentatively as intervening absorbers. Most of these
absorbers have not been picked up by earlier, automated QSO absorption line
detection algorithms. This sample substantially increases the number of known
O VI absorbers at redshifts beyond z_abs > 2.7.
1. Introduction



Our understanding of the nature of the intergalactic medium (IGM) and its evolution,
as traced by the absorption features observed in the sightlines towards luminous objects like
quasars, has benefitted tremendously within the past few years from both observational and
theoretical advances. High-resolution studies of the Lyman α forest have been extended to
very low column densities thanks to high-resolution echelle spectrographs on powerful 8m class
telescopes like HIRES (Keck) and UVES (VLT) (Hu et al. 1995; Lu et al. 1996; Kim et al.
1997; Kirkman & Tytler 1997). At the same time, theoretical models incorporating gas
dynamics, radiative cooling, and photoionisation have been developed that can reproduce
the majority of the observed properties of the quasar absorption spectra (Cen et al. 1994;
Miralda-Escudé et al. 1996; Hernquist et al. 1996; Petitjean et al. 1995; Davé et al. 1997).

In this picture, baryonic gas can be encountered in a wide variety of physical conditions.
Within the Lyman α forest, at least at high redshift where the forest is the main repository
for baryons, most of the gas resides in relatively cool (T ~ 10^4 K), low-to-medium overdensity
structures that are not in dynamical or thermal equilibrium and are mostly governed by
photoionisation.

The IGM is expected to be highly ionised, with a neutral fraction of hydrogen f(H I) ~ 10^-[...]
to 10^-[...]. In such a medium, metals are also highly ionised. Thus oxygen, as the most
abundant intergalactic metal, may exist in the O VI state. Since the O VI 1032/1038 Å doublet
is observable from ground-based telescopes for redshifts beyond z_abs ~ 2.0, these transitions
constitute a primary tool for studying the characteristics of the metals in the IGM at high
redshift. The density of the IGM is sufficiently low as to allow the production of O VI by
[...] ([...] 1996; Schaye et al. 2000; Simcoe et al. 2004) [...] redshifts can usually be
modelled by structures that are photoionised (Bergeron et al. 2002; Carswell et al. 2002;
Levshakov et al. 2003).



Due to the different wavebands required for detections at different redshifts and the strong
evolution of the Lyman forest, a variety of techniques and observational strategies
have to be applied for finding and identifying O VI absorbers. Detection of the
O VI 1032/1038 Å doublet from ground-based telescopes is only possible beyond redshifts of
z_lim > 2.0 due to the 3000 Å atmospheric cutoff, and thus studies of the absorber statistics
below this threshold need to rely on space-based instruments. Burles & Tytler (1996)
conducted the first systematic survey of O VI absorbers at z ~ 1.0, employing FOS on HST to
study a sample of 11 QSOs, and found 12 suitable candidates of which 9 were expected to
be real, thus establishing that the number density per redshift interval of these absorbers is
similar to, if not greater than, that for C IV and Mg II absorbers at the same redshift.
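
As a quick plausibility check of the redshift thresholds, the short sketch below converts a blue wavelength limit into the lowest redshift at which the O VI 1032 Å line can be observed. The helper name `min_redshift` is hypothetical, the rest wavelength of 1031.912 Å is the value used for the stronger doublet member, and the 3800 Å blue limit of the SDSS spectra is included for comparison.

```python
# Rest wavelength of the stronger O VI doublet member in Angstrom.
OVI_1032 = 1031.912

def min_redshift(blue_limit, rest_wavelength=OVI_1032):
    """Lowest redshift at which the line lies redward of blue_limit (Angstrom)."""
    return blue_limit / rest_wavelength - 1.0

z_ground = min_redshift(3000.0)  # atmospheric cutoff
z_sdss = min_redshift(3800.0)    # blue end of the SDSS spectra

print(f"atmospheric cutoff: z > {z_ground:.2f}")
print(f"SDSS blue limit:    z > {z_sdss:.2f}")
```

The two limits come out at roughly 1.9 and 2.7, in line with the thresholds of about 2.0 for ground-based work and 2.7 for SDSS.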
Towards higher redshifts, despite the advantage of shifting the lines into the range for
ground-based instruments, the ever increasing density of the Lyman forest renders unambiguous
identifications of O VI absorbers in its midst more difficult than detections made
longward of the Lyman α forest. Thus, most studies have relied upon high-resolution and
high-signal-to-noise ratio spectra. Around redshifts of z ~ 2.0-3.0, a variety of surveys find
evidence for a population of absorbers residing in low to medium overdense regions, mostly
photoionised by the ambient extragalactic radiation field (Carswell et al. 2002; Bergeron et al.
2002; Aracil et al. 2004; Simcoe et al. 2004) [...] these surveys ranges from 10^[...] to
10^[...] of the solar value for structures around the mean density of the universe at z ~ 2.0.
Using UVES spectra for 2 lensed quasar systems, Lopez et al. (2007) find size constraints for
such intervening O VI absorbers with lower boundaries on the kpc scale, where no or only very
little variation between different lines of sight is seen. This is broadly consistent with
other studies of binary QSO sightlines inferring correlation lengths of the absorbers up to
several tens of kpc (Smette et al. 1995; Petitjean et al. 1998; Lopez et al. 2000; Rauch et al.
2001).



An interesting method to statistically infer the C IV abundance from measuring the mean
C IV optical depth associated with all pixels of the Lyman α forest with similar optical
depths was pioneered by Cowie & Songaila (1998). The authors find a general correlation
of τ(C IV) with τ(H I) when τ(H I) > 1.0. This method was applied to O VI by Songaila
(1998), Schaye et al. (2000) and recently Aracil et al. (2005), who assert the existence of O VI
in gas where the H I optical depth is as low as 0.1. There is, however, some ambiguity in the
interpretation of the results of such statistical pixel-to-pixel correlations. For an extensive
discussion of the method see Aguirre et al. (2002), and Pieri & Haehnelt (2004) for a critical
analysis of the results.

The mechanism for enriching the IGM with metals to the level inferred by these surveys
remains somewhat unclear, yet there are at least two good candidates: early and widespread
dissemination of metals by Population III star formation in pre-galactic structures at very high
redshifts (Nath & Trentham 1997; Ferrara et al. 2000; Barkana & Loeb 2001; Madau et al.
2001), or winds and superwinds from starbursts within galaxies at later stages (Aguirre et al.
2001; Adelberger et al. 2003). For the latter mechanism, driving these winds out of the dense
environments into the low density IGM remains a crucial point of contention, and a variety
of methods have been suggested, ranging from supernovae (Couchman & Rees 1986) to
ejection by mergers (Gnedin & Ostriker 1997) or photoevaporation (Barkana & Loeb 2001).
Table 1 gives an overview of different surveys for direct O VI detection in a variety of
environments and over a wide redshift range. Reimers et al. (2006) find 6 different O VI systems
and a further 8 potential, but blended, candidates in a single object (HS 0747+4259) at 1.46
< z_abs < 1.81, deriving a redshift path density of dN_OVI/dX ~ 13. Simcoe et al. (2004)
analyse 230 Lyman α forest lines with a hydrogen column density N_HI > 10^[...] cm^-2 for 7
QSOs with z_em ~ 2.6, and retrieve a total of about 50 accompanying O VI systems with
secure identifications. Using the capabilities of VLT and UVES within the 'Large Programme:
The Cosmic Evolution of the IGM', Bergeron & Herbert-Fort (2005) survey 10 bright QSOs
at 2.1 < z_em < 2.8, and find 136 O VI candidates with 12.7 < log N_OVI < 14.6 in 51 systems.
Carswell et al. (2002) focus on two z_em = 2 QSOs and hydrogen absorbers with log N_HI >
14.0. They identify 7 individual O VI lines in 2 intervening systems in one case, and find 13
individual lines in 10 intervening systems in the other. Fox et al. (2007) study the frequency
of O VI absorbers associated with 35 damped and sub-damped Lyman α systems, and report
12 detections of which 9 are intervening. Lopez et al. (2007) present 10 intervening O VI
absorbers seen towards two lensed QSO pairs at a median redshift of z_abs = 2.3. Our sample,
residing at z_abs > 2.7, therefore greatly extends the redshift range of known O VI absorbers
at high redshifts. It will allow us to test expectations based upon photoionisation models like
the ones presented in Davé et al. (1998), thereby possibly constraining the physical
conditions of the IGM and the metagalactic UV/X-ray background at high redshifts.
In this paper, we have decided to apply a direct pixel-by-pixel search for signatures of strong 
O VI 1032/1038 Å features seen in the spectra of SDSS QSOs. This is, to our knowledge,
the first systematic survey at high redshifts (2.7 < z_em < 5.0) at low resolution (R ~ 1800)
and low signal-to-noise ratio. We demonstrate that there is a redshift window of opportunity 
where the density of the Lyman forest is not yet too high to obliterate the hope for finding 
and identifying relatively narrow metal absorbers, and that the pathlength covered by the 
combined SDSS sample is high enough to expect finding a reasonable number of such ab- 
sorbers given conservative assumptions about the metallicity of the IGM, the abundance of 
O VI due to photoionisation and the average signal-to-noise ratio of the QSO data sample. 
What type of O VI absorbers do we expect to detect via such a direct search without a
priori selecting on other transitions? If a sightline passes through a galaxy, and if there are
analogues to local examples surveyed early by Rogerson et al. (1973), York (1974)
and York (1977) for highly ionised metals in absorption, we expect to detect in
such cases O VI mixed with C IV in the gas phase inside that galaxy, as in the disk of
our Galaxy. Furthermore, a recent, thorough examination of FUSE spectra (Wakker et al.
2003) revealed the occurrence of local O VI absorbers in almost all of the more than 100
sightlines probed by extragalactic background objects, indicating a large covering fraction
when a sightline passes the Galaxy. While the majority of these absorbers are probably
located within the Galaxy (Savage et al. 2003), especially the high-velocity O VI (relative
to the Local Standard of Rest) traces a variety of different environments and phenomena,
most likely including warm/hot gas interaction in an extended Galactic corona and even
truly intergalactic gas within the Local Group (Sembach et al. 2003). As an aside, the gas
residing in the background AGN host galaxy is subject to a strong UV radiation field from the
central engine itself, and thus might contain more O VI than normal galaxies. And while
we are per se not interested in such absorbers for the study of intergalactic gas, we have
included 'intrinsic' absorbers, as they are being found with our algorithm as well.
From hydrodynamic simulations, Cen et al. (2001) assert, however, that the majority of
detectable lines with rest-frame equivalent widths EW_r(O VI 1032 Å) > 0.035 Å do not reside
in virialised regions like galaxies, groups and clusters, but trace intergalactic gas at only
moderate overdensities (~10-40). Although their estimated redshift path density of about 5
such absorbers per unit redshift drops rapidly by a factor of at least ten when increasing
the equivalent width limit to 0.35 Å, even that lower rate of incidence allows us to detect an
appreciable number of such strong intergalactic O VI absorbers, owing to the large redshift
path length probed by the collective SDSS AGN sample.

To summarise, we expect to detect the signals of O VI absorbers in a variety of environments.
While this first step of finding O VI rests upon a 'blind' O VI doublet search plus
an alignment with Lyman α and β lines in order to ensure high purity of the candidate
sample, we may then, in a future effort, use independent line detections of different ions
to better characterise the physical state and potentially the nature of the absorbers found
here. Because there are a variety of possible origins of the O VI gas, including outflows,
intervening galaxies, and the contribution of the AGN host galaxy, this next step is needed
to arrive at a sample of IGM absorbers. There is evidence for a population of O VI absorbers
that contain little, if any, detectable hydrogen, as pointed out by Bergeron & Herbert-Fort
(2005), and it is clear that our search algorithm presented here cannot detect those. However,
as we anticipate the need for high-resolution follow-up of our candidates in order to
confirm their nature, we might then be able to learn about this oxygen-rich subset (type 0
of Bergeron & Herbert-Fort (2005)) of absorbers as well.

The paper is organised as follows: section 2 details our preliminary analyses of whether
the signal-to-noise ratio of the SDSS spectra and the density of the Lyman forest enable
our search program. In section 3 we present our SDSS QSO sample selection, and section 4
gives an overview of the search algorithms we applied. Before we summarise and conclude
in section 6, we present the results of our search in section 5. The search method developed
here can be easily tailored to a variety of different ionic species. We stress that follow-up
observations of the candidates we retrieve with high S/N and high resolution are necessary
to determine the reliability of our search algorithm by determining the nature of the features
we detect.

Throughout this study we use a cosmology with H_0 = 71 km s^-1 Mpc^-1, Ω_m = 0.27 and
Ω_Λ = 0.73. Abundances are given by number relative to hydrogen, and vacuum wavelengths
are used, if not otherwise noted.






2. Preliminary feasibility analyses 

With rest-frame wavelengths below the H I Lyman α wavelength of 1215.67 Å, the
O VI 1032 Å/1038 Å doublet absorption lines inevitably fall into the Lyman α forest of
systems located along the same sightline at slightly lower redshifts. Thus, two effects may
hamper our attempts to find such O VI absorbers: first, the ubiquity of the Lyα forest
lines, especially at higher redshifts, can lead to blending of the oxygen lines with H I lines,
thus rendering a correct identification of the doublet impossible. Second, two different Lyα
lines at different redshifts and the accidental occurrence of optical depth ratios expected for
O VI can mimic an O VI doublet, leading to a false identification.



2.1. The expected number of falsely identified absorbers 

In order to estimate the severity of the latter effect, we have created mock catalogues
of Lyman forest lines, using the line density estimate of Kim et al. (2001) and the column
density distribution function of Hu et al. (1995):

$$\frac{dN}{dz} = 9.06 \times (1+z)^{2.19 \pm 0.27} \quad [\ldots] \qquad (1)$$

$$f(N_{\rm HI})\,dN_{\rm HI} = 4.9 \times 10^{[\ldots]} \times N_{\rm HI}^{[\ldots]}\,dN_{\rm HI} \qquad (2)$$

where dN/dz denotes the number of Lyman α lines per redshift interval dz, and f(N_HI) is the
probability distribution of obtaining a line with column density of neutral hydrogen N_HI
within the interval [N_HI, N_HI + dN_HI]. We then analysed these catalogues, and retrieved all
line pairs that exhibit the correct wavelength ratio (within a given tolerance) as well as the
correct ratio of optical depths at the line centre:

$$r_1 = \frac{\lambda({\rm O\,VI}(1032\,\text{\AA}))}{\lambda({\rm O\,VI}(1038\,\text{\AA}))} = 0.99452 \qquad (3)$$

$$r_2 = \frac{\tau({\rm O\,VI}(1032\,\text{\AA}))}{\tau({\rm O\,VI}(1038\,\text{\AA}))} \approx 2 \qquad (4)$$

r_1 is independent of redshift since both lines get shifted by the same factor (1+z), and the
result for r_2 follows from atomic physics:

$$\tau_0 = [\ldots] \qquad (5)$$

where N is the column density, λ the wavelength of the transition, c the speed of light, A_kj
the Einstein coefficient for the transition from the upper (k) to the lower (j) level, and
b_λ = λ_0 [...] the line broadening factor. The ratio of the statistical weights g_u/g_j of the
two transitions is equal to two.
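
The two doublet ratios can be checked numerically. The sketch below is illustrative only: the oscillator strengths `F_1032` and `F_1038` are assumed literature values that do not appear in the text, and the optical depth at line centre is taken to scale as N f λ / b, so that N and b cancel for two lines of the same ion. With the rest wavelengths used here, r_1 agrees with the quoted 0.99452 to within a few parts in 10^5; the last digits depend on the adopted wavelengths.

```python
# Rest wavelengths in Angstrom for the two O VI doublet members.
LAM_1032, LAM_1038 = 1031.912, 1037.613
# Assumption: absorption oscillator strengths taken from the literature.
F_1032, F_1038 = 0.1325, 0.0658

def wavelength_ratio(z=0.0):
    """Observed wavelength ratio r_1; the factor (1 + z) cancels."""
    return (LAM_1032 * (1.0 + z)) / (LAM_1038 * (1.0 + z))

def optical_depth_ratio():
    """Line-centre optical depth ratio r_2 for equal N and b (tau_0 ~ N f lambda / b)."""
    return (F_1032 * LAM_1032) / (F_1038 * LAM_1038)

r1 = wavelength_ratio()
r2 = optical_depth_ratio()
print(f"r1 = {r1:.5f}, r2 = {r2:.3f}")
```

Because the stronger line has twice the optical depth only in the unsaturated, unblended case, a measured ratio far from 2 hints at blending or saturation.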

With the given resolution of SDSS spectra (R ~ 1800) and the conservative assumption that
line centroids can be determined to about 1/3 of the resolution element, we expect a tolerance
level for the wavelength ratio of Δr_1 = 3.0 × 10^-4. Furthermore, we can rule out all pairs that
do not exhibit the range of flux decrements expected from the ratio of the two line strengths
convolved with the instrument capabilities. The efficiency of this procedure depends crucially
on the ability to precisely measure the equivalent width (EW) of lines, but even a conservative
estimate of a 20% accuracy on a single pixel flux measurement, corresponding to a low
signal-to-noise ratio of about 5, leads to a reduction of the number of false detections by roughly a
factor of eight at all redshifts.
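
The quoted tolerance can be motivated with a short calculation. The sketch assumes that the centroid errors of the two lines are independent and add in quadrature in the ratio; this is an assumption of the example, not a statement of the procedure used for the survey.

```python
import math

R = 1800.0                 # spectral resolution lambda / delta-lambda
CENTROID_FRACTION = 1 / 3  # centroid accuracy in units of the resolution element

# Relative uncertainty of a single line centroid.
sigma_centroid = CENTROID_FRACTION / R

# Assumption: independent errors on both lines add in quadrature in the ratio.
delta_r1 = math.sqrt(2.0) * sigma_centroid

# A signal-to-noise ratio of 5 corresponds to a 20% flux error per pixel.
snr = 5.0
flux_error = 1.0 / snr

print(f"delta_r1 ~ {delta_r1:.1e}, per-pixel flux error ~ {flux_error:.0%}")
```

Under these assumptions the tolerance comes out at about 2.6 × 10^-4, which rounds to the fiducial 3 × 10^-4.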

An analysis of the mock data sets spanning the complete redshift range available to us (2.7
< z_abs < z_max(SDSS)) and varying the thresholds on the desired precision for r_1 and r_2 yields
for the number of expected interlopers:

$$n_I(z_{abs}) = n_{I,0}(z_{abs}) \left([\ldots]\right) \left([\ldots]\right) \qquad (6)$$

where z_abs is the absorber redshift, and n_I is the number of expected absorbers per spectrum.
n_{I,0}(z_abs) is a strongly increasing function of the redshift:

$$n_{I,0}(z_{abs}) = 2.75 \times 10^{-[\ldots]}\,(1+z_{abs})^{[\ldots]} \qquad (7)$$

Note that the coefficients and exponents in equations [6] and [7] are derived from analysing the 
mock data set, without taking additional absorption from metal lines into account. Nor did 
we analyse the contribution of random noise features mimicking O VI doublets. The latter 
problem, however, is obviated by the fact that we are going to focus on lines that are clearly 
above the noise limit, as we will explain in section 3. 
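
The interloper search on mock catalogues can be sketched as follows. This is a simplified illustration, not the code used for the analysis: it draws only line redshifts from dN/dz = 9.06 (1+z)^2.19, ignores Poisson scatter in the number of lines, and tests only the wavelength-ratio criterion. Column densities and the optical depth criterion are left out, and all function names are hypothetical.

```python
import bisect
import random

LYA = 1215.67             # H I Lyman alpha rest wavelength in Angstrom
R1 = 0.99452              # O VI doublet wavelength ratio
NORM, GAMMA = 9.06, 2.19  # line density dN/dz = NORM * (1 + z)**GAMMA

def expected_lines(z_min, z_max):
    """Mean number of forest lines between z_min and z_max (integral of dN/dz)."""
    k = GAMMA + 1.0
    return NORM / k * ((1.0 + z_max) ** k - (1.0 + z_min) ** k)

def draw_redshifts(n_lines, z_min, z_max, rng):
    """Inverse-transform sampling of redshifts with density ~ (1 + z)**GAMMA."""
    k = GAMMA + 1.0
    lo, hi = (1.0 + z_min) ** k, (1.0 + z_max) ** k
    return sorted((lo + rng.random() * (hi - lo)) ** (1.0 / k) - 1.0
                  for _ in range(n_lines))

def count_doublet_like_pairs(wavelengths, tol=3.0e-4):
    """Count pairs in a sorted wavelength list whose ratio lies within tol of R1."""
    pairs = 0
    for lam_long in wavelengths:
        first = bisect.bisect_left(wavelengths, lam_long * (R1 - tol))
        last = bisect.bisect_right(wavelengths, lam_long * (R1 + tol))
        pairs += last - first
    return pairs

rng = random.Random(42)  # fixed seed for reproducibility
n_lines = round(expected_lines(2.7, 3.0))
mock = [LYA * (1.0 + z) for z in draw_redshifts(n_lines, 2.7, 3.0, rng)]
print(n_lines, "lines,", count_doublet_like_pairs(mock), "pairs mimic the O VI spacing")
```

Repeating the draw for many seeds and redshift intervals would give the average number of wavelength-matched pairs per spectrum, to which the optical depth criterion can then be applied.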

We focus on O VI absorbers accompanied by strong H I Lyman absorption^1. Therefore, we
rule out all lines that do not show accompanying H I transitions. Given the line density
estimate of Kim et al. (2001), the lower limit for the average velocity separation of two
Lyman α absorbers is

$$\Delta v_{\rm Ly\alpha} = c \times \frac{\Delta z}{1+z_{abs}} = c \times \frac{1}{9.06 \times (1+z_{em})^{2.19} \times (1+z_{abs})} > 30{,}000\ {\rm km/s} \times (1+z_{em})^{-3.19} \qquad (8)$$




^1 Note that Bergeron et al. (2002), Carswell et al. (2002), and Bergeron & Herbert-Fort (2005) report
some unusual O VI absorbers with high oxygen abundances (-1 < [O/H] < 0). Such absorbers, classified
by Bergeron & Herbert-Fort as type 0, could potentially slip through our search criterion as they need not
be accompanied by strong H I absorption. A priori we cannot determine which fraction of these absorbers
we might lose; some of them exhibit strong enough H I absorption to pass our criteria (right panel of Fig. 2 of
Bergeron & Herbert-Fort (2005)).
where we used the relation z_abs < z_em to obtain the lower limit in the last step, and equated
Δn = 1, i.e. one line per redshift interval Δz.

The velocity separation of one pixel in SDSS spectra, however, amounts to

$$\Delta v_{\rm SDSS} = c \times \frac{\Delta\lambda_{\rm pixel}}{\lambda} \approx 70\ {\rm km/s}. \qquad (9)$$

Thus, the probability of having at least one H I line fall right onto the pixel corresponding to
the redshift of an "O VI pixel" is

$$P_{\rm interloper} = \frac{\Delta v_{\rm SDSS}}{\Delta v_{\rm Ly\alpha}} \approx 2.3 \times 10^{-3} \times (1+z_{em})^{3.19} \qquad (10)$$



Note that the Kim et al. (2001) line density includes Lyman α absorbers with column
densities as low as log N_HI = 13.64. Restricting ourselves to absorbers that show at least an
optical depth at line centre for the Lyman β line of τ_0(Lyman β) = 1.0 leads to another
factor of f_strong ~ 0.4 in the exclusion of spurious interlopers. Therefore, we can further reduce
the number of spurious interlopers by at least P_interloper × f_strong by requiring the candidate
O VI feature to be accompanied by strong Lyman α and β absorption. Since higher order
series lines become rather weak, adding additional components probably does not follow the
same simple multiplicative propagation of probabilities, and so we have considered in this
feasibility study only the first two terms of the Lyman series.
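
The chain of estimates for the chance alignment of an H I line can be evaluated numerically. In the sketch below the line density is evaluated at a single redshift z (that is, z_abs is set equal to z_em), and the SDSS pixel width is derived from an assumed constant spacing of 10^-4 in log10(wavelength). The function names are hypothetical.

```python
C_KMS = 299792.458        # speed of light in km/s
NORM, GAMMA = 9.06, 2.19  # line density dN/dz = NORM * (1 + z)**GAMMA
F_STRONG = 0.4            # fraction of lines with strong Lyman beta absorption

# Assumption: SDSS pixels are spaced by 1e-4 in log10(wavelength).
DV_PIXEL = C_KMS * (10 ** 1.0e-4 - 1.0)

def delta_v_lya(z):
    """Mean velocity separation of forest lines, c * dz / (1 + z) with dz = 1 / (dN/dz)."""
    return C_KMS / (NORM * (1.0 + z) ** (GAMMA + 1.0))

def p_interloper(z, require_strong=False):
    """Chance that a forest line falls on a given pixel, optionally a strong one."""
    p = DV_PIXEL / delta_v_lya(z)
    return p * F_STRONG if require_strong else p

for z in (2.8, 3.2, 4.0):
    print(f"z = {z}: P = {p_interloper(z):.3f}, strong only = {p_interloper(z, True):.3f}")
```

The values are slightly lower than those from the rounded expression in the text, because c/9.06 is about 33,000 km/s, while the text adopts the more conservative 30,000 km/s.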



Figure 1 shows in red the number of expected H I interlopers without (dashed) or with
(dotted) the additional requirement of accompanying Lyman α and β absorption. For these
calculations we have used the fiducial values of Δr_1 < 3 × 10^-4 and Δr_2 < 0.2. Introducing
the additional requirement significantly reduces the number of false positives, as Figure 1
demonstrates.



Finally, we can also disqualify a potential candidate as belonging to an O VI doublet if
it is clear that it is part of a Lyman series itself, i.e. we can check whether there are lower (or
also higher) Lyman transitions at the same redshift with similar velocity structures. This cut
becomes possible for all lines above 4500 Å, when the Lyman β line redshifts into the lower
wavelength regime of SDSS. Thus, we expect to reduce even further the number of falsely
classified O VI absorbers beyond a redshift of z_OVI ~ 3.37 by a large fraction. Another
criterion that can be used to correctly identify a metal line doublet is the shape of the lines


^2 Note that we do not take line clustering into account here, which leads to a decreased velocity separation
in areas of high line density.
[Figure 1. Horizontal axes: absorption redshift z_abs (top) and log (1 + z_abs) (bottom).]



Fig. 1. The number of O VI absorption features and spurious H I interlopers expected
in the spectra of SDSS quasars versus the redshift z_abs of the absorber. The rapid increase
for the O VI lines up to log (1 + z_abs) ~ 0.65 (z_abs ~ 3.4) can mainly be attributed to the
nearly linearly growing redshift pathlength, whereas the rapid increase of H I interlopers is
due to the redshift evolution of the line density. We assume [O/H] = -2.0 to calculate
the number of absorbers by integrating over the column density distribution of Hu et al.
(1995), after applying a correction factor for the fraction of lines retrievable (for details see
section 2.2). The red, dashed curve represents the expected number of H I interlopers that
exhibit less than a fiducial value for the deviations in both line position and line strength
(for details see text), whereas the dotted line takes into account the additional requirement
that the interlopers have to show associated, strong H I absorption. It is obvious that our
search efficiency peaks around a redshift of z_abs ~ 3.2, and rapidly deteriorates for redshifts
z_abs > 4.0. Thus, we expect our "window of opportunity" to be between 2.8 < z_abs < 4.0,
where the ratio of detections of real systems to spurious interlopers is greater than unity.
: both are expected to have the same velocity structure and should be rather narrow, in
stark contrast to the Lyman lines that tend to be broader than 20 km/s. Due to the low 
resolution of SDSS spectra, however, this criterion cannot be implemented here unless the 
H I line is so strong that its width exceeds one resolution element. Such strongly saturated 
or even damped lines are rare, and would have been picked up by other automatic search 
algorithms already, as we describe later. 
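
The Lyman-series cross-check can be illustrated with a few lines of code. The sketch treats a forest line as a hypothetical Lyman α line and asks whether its Lyman β partner would fall inside the SDSS wavelength coverage; the 3800 Å blue limit and the helper names are assumptions of this example.

```python
LYA, LYB = 1215.67, 1025.72   # H I Lyman alpha and beta rest wavelengths (Angstrom)
OVI_1032 = 1031.912           # stronger O VI doublet member (Angstrom)
SDSS_BLUE_LIMIT = 3800.0      # assumed blue end of the SDSS spectral coverage (Angstrom)

def lyman_beta_partner(lam_obs):
    """Observed Lyman beta wavelength if the line at lam_obs were Lyman alpha."""
    return lam_obs * LYB / LYA

def series_check_possible(lam_obs):
    """True if the hypothetical Lyman beta partner falls inside the SDSS coverage."""
    return lyman_beta_partner(lam_obs) >= SDSS_BLUE_LIMIT

lam_threshold = SDSS_BLUE_LIMIT * LYA / LYB
z_ovi_threshold = lam_threshold / OVI_1032 - 1.0
print(f"check possible above {lam_threshold:.0f} A, i.e. z_OVI > {z_ovi_threshold:.2f}")
```

With these numbers the threshold lies close to 4500 Å, corresponding to O VI redshifts of roughly 3.36, consistent with the values quoted above.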

We will discuss the implementation and success of these and various other criteria to 
distinguish real O VI lines from H I features mimicking metal lines in section 3. 



2.2. The detectability of O VI doublets in the Ly α forest

In order to estimate whether blending and/or noise renders the identification of existing
O VI doublets in the Lyman forest impossible, we have created 100 mock spectra of a
typical Lyman forest, following the number and column density distributions from above, and
adjusting the spectral resolution and pixel size to the SDSS fiducial value. For the Doppler
broadening parameter, we have assumed a Gaussian distribution of width 8 km/s around a
mean of b = 2[...] km/s; the exact values for the distribution are, however, not important at this
stage of modelling, as the instrumental resolution broadens the line profiles over more than
the average velocity width. In this forest we placed O VI doublets at a redshift of about 3.5,
added noise according to the appropriate (worst-case) SDSS scenario (i.e. we have chosen a
value of S/N = 3 here, which is below the nominal S/N > 4 SDSS limit; see section 3.3 for
details of the SDSS sample's S/N), and checked carefully how often we could reliably retrieve
the O VI doublet when varying the strength of the absorption feature (until it completely
vanished in the noise). Figure 2 shows two examples: in the first, both components are
clearly identified, whereas in the second, both are severely blended. Of course, stronger lines
can be identified more easily. From this analysis, we conclude that at least 15% of systems
with log (EW_1032Å,obs / Å) = -0.8 can be unambiguously identified at a redshift of z
~ 3.5, and hence higher fractions at lower z or greater strength. Because the modelling at
this stage is only supposed to render an initial estimate of the line detectability, we have
not included second-order effects like line clustering, or a detailed description of the effects
of continuum placement errors.
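
To relate the observed equivalent width limit to rest-frame quantities, the conversion can be written out explicitly. The helper name `rest_frame_ew` is hypothetical; the only physics used is that equivalent widths scale with (1 + z).

```python
def rest_frame_ew(ew_obs, z_abs):
    """Convert an observed equivalent width to the absorber rest frame."""
    return ew_obs / (1.0 + z_abs)

log_ew_obs = -0.8             # log10 of the observed EW in Angstrom
ew_obs = 10.0 ** log_ew_obs   # about 0.16 Angstrom
ew_rest = rest_frame_ew(ew_obs, 3.5)

print(f"EW_obs = {ew_obs:.3f} A, EW_rest = {1000.0 * ew_rest:.1f} mA")
```

At z ~ 3.5 the limit corresponds to a rest-frame equivalent width of about 35 mÅ, comparable to the 0.035 Å threshold quoted from Cen et al. (2001) in the introduction.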

How many such absorbers do we expect per sightline? The above limit for the EW lies
[Figure 2. Horizontal axis: wavelength [Å], covering approximately 4640 to 4670 Å.]

Fig. 2. Two examples of mock normalised AGN spectra with a resolution typical for the
SDSS set-up. On top of the Lyman α forest features, generated according to the column
density distribution and line densities discussed in the text, an O VI doublet with EW_obs
= 0.15 Å was placed at redshift z_abs ~ 3.5. The positions of the two lines are indicated by
the dotted line. Then the worst-case level of noise for the SDSS spectra was added. Upper
panel: In this example both absorption lines can be easily identified and they maintain the
expected optical depth ratio. Lower panel: In this example, both components are blended
with Lyman forest lines, changing the optical depth ratio drastically. Note also how noise
affected the 1038 Å component and shifted its flux upwards. Clearly, such a scenario renders
the identification of the doublet impossible.
only slightly higher than the Line Observability Index^3 (LOX; cf. Hellsten et al. 1998)
derived by Davé et al. (1998) for a metallicity of [O/H] = -2.0^4. Integrating the Hu et al.
(1995) distribution over the interval 13.8 < log N_HI < 15.4, where the LOX is of order 1.5
and above (cf. fig. 2 of Davé et al. (1998)), and along the sightline to a QSO to a redshift
z_abs of the absorber, we obtain the number n(z_abs) of such lines associated with detectable
O VI absorption:

$$n(z_{abs}) = \int_{13.8}^{15.4} \int_{2.7}^{z_{abs}} \frac{dX}{dz}\, f(N_{\rm HI})\, r({\rm LOX}, z_{em})\; d\log N_{\rm HI}\, dz \qquad (11)$$

where

$$\frac{dX}{dz} = \frac{(1+z)^2}{\sqrt{(1+z)^2 (1+\Omega_m z) - z(z+2)\,\Omega_\Lambda}}$$

is the redshift path length, and r(LOX, z_em) is the
estimate for the fraction of lines correctly identifiable at the emission redshift of the QSO.
Here we have assumed for simplicity a constant r(LOX) = 0.15, a value based upon our
simulation at z ~ 3.5. The rapid rise of the number of expected lines in Figure 1 is a
result of the almost linearly growing redshift pathlength ∫ (dX/dz) dz. It is clear from Figure 1
that beyond redshifts of z_abs ~ 4.2, the sheer number of interlopers will make unambiguous
detections of O VI features very difficult, and the search efficiency will peak below that
redshift.
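
The redshift path length can be evaluated numerically for the adopted cosmology (Ω_m = 0.27, Ω_Λ = 0.73). The sketch below assumes the standard definition of the absorption distance, dX/dz = (1+z)^2 / E(z), with E(z)^2 = (1+z)^2 (1 + Ω_m z) - z(z+2) Ω_Λ; the numerator and the helper names are assumptions of this example.

```python
import math

OMEGA_M, OMEGA_L = 0.27, 0.73   # cosmology adopted throughout the paper

def e_squared(z):
    """Square of the dimensionless Hubble rate E(z)."""
    return (1.0 + z) ** 2 * (1.0 + OMEGA_M * z) - z * (z + 2.0) * OMEGA_L

def dx_dz(z):
    """Absorption path per unit redshift. Assumption: numerator (1 + z)**2."""
    return (1.0 + z) ** 2 / math.sqrt(e_squared(z))

def path_length(z_min, z_max, steps=200):
    """Integrate dX/dz from z_min to z_max with the composite Simpson rule (even steps)."""
    h = (z_max - z_min) / steps
    total = dx_dz(z_min) + dx_dz(z_max)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * dx_dz(z_min + i * h)
    return total * h / 3.0

for z in (3.0, 3.5, 4.0):
    print(f"X(2.7 -> {z}) = {path_length(2.7, z):.2f}")
```

Multiplying such a path length by the integrated column density distribution and the retrieval fraction r(LOX) = 0.15 would give the expected number of detectable systems per sightline.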



2.3. Summary of the initial feasibility analyses 

From the simulations and estimates above, we conclude that we will be able to retrieve a 
large enough sample of O VI absorbers in the spectra of SDSS quasars to obtain meaningful 
results for a statistical analysis. Furthermore, we will be able to exclude false identifications 
via a variety of methods, so that we can hope to clean, to a certain degree, our sample of 
O VI system candidates up to redshifts z_em ~ 4.0.



^3 The Line of Sight Observability Index (LOX) is defined properly in Hellsten et al. (1998). It predicts a
rest equivalent width W_r for an absorption line of a given species by modelling gas irradiated with a given
X-ray/UV background with CLOUDY (Ferland et al. 1998). For our purposes, LOX = log (W_r / 1 mÅ).

^4 Here, and in the following, the bracket notation is used to indicate the logarithm of the oxygen to
hydrogen abundance relative to the Sun: [O/H] = log [N(O)/N(H)] - log [N(O)/N(H)]_⊙
3. The AGN sample



3.1. SDSS imaging and spectroscopy 

The Sloan Digital Sky Survey (SDSS) is an imaging and spectroscopic survey of the
sky (York 2000) using a dedicated wide-field 2.5m telescope (Gunn 2006; Gunn et al. 1998)
at Apache Point Observatory, New Mexico. Imaging is carried out in a drift-scan mode
using five broad ugriz bands spanning the range from 3000 to 10,000 Å (Fukugita et al.
1996). The 95% completeness limit in the r band is 22.2 mag (on the AB system). For
more specific information on the imaging and quality control thereof see Hogg et al. (2001),
Ivezić et al. (2004), Smith et al. (2002) and Tucker et al. (2006). Details of the astrometry
can be found in Pier et al. (2003). A variety of algorithms selects objects from the imaging
data for spectroscopy. These targets are arranged on a series of plug plates, called tiles, with
radius 1.49° (Blanton et al. 2003), each containing provision for a total of 640 targets and
calibration stars. Optical fibers at the focal plane feed the light from holes in aluminium
plates to a pair of double spectrographs. The resulting spectra range from 3800 to 9200 Å
with a resolution of R = λ/Δλ ~ 1800. The S/N of the SDSS spectra entering the public data
base is required to be above 4.0 at a fibre magnitude of 19.9 in the red SDSS spectrum and
a g magnitude of 20.2 in the blue (Adelman-McCarthy 2006). Specific information on SDSS
data products can be found in Stoughton (2002) and Abazajian (2003, 2004, 2005) as well
as Adelman-McCarthy (2007).



3.2. The SDSS absorption-line catalogue 



Quasars within the SDSS survey are mainly selected by colour criteria from the imaging
campaign, and subsequently verified by spectroscopic follow-up studies (Richards et al.
2002). Quasar spectra with absorption lines have been published by a number of groups,
Hall (2002), Menou et al. (2001), Tolea et al. (2002), Reichard et al. (2003) and Hall et al.
(2002), mainly focusing on broad absorption features. Several teams have developed pipelines
isolating and identifying absorption lines and systems in QSO spectra (Vanden Berk et al.
2000; Richards et al. 2001; York & SDSS Collaboration 2001; York et al. 2005). Following
the format of the catalogue by York et al. (1991), a comparison of these pipeline-created
catalogues with lists made by visual inspection reveals a completeness of above 95% for such an
automation. The incompleteness results mainly from blending of the Mg II line, key to the
detection of QSO absorption systems, most importantly with night sky emission.
The main catalogue of York et al. (2005) presents lists of significant absorption lines,
equivalent widths, various line parameters and possible redshifts. Discrete absorption line systems
are obtained by matching lines of different transitions at redshifts within a certain range,
depending on the S/N of the spectra. Different criteria such as number and quality of
components allow for a grading of the systems: all Class A systems, for example, need to contain at least
four unblended lines above a 4σ detection limit matching in redshift, free of artefacts and
blends, occurring redward of the Lyman α forest (York et al. 2005). The QSOALS catalogue
used here is based on the SDSS Data Release 3 (DR3) (Abazajian 2005).



3.3. Our SDSS QSO sample 

At rest-frame wavelengths of 1031.912 Å and 1037.613 Å, the O VI doublet starts to
become visible in the spectra of SDSS quasars only at redshifts of z_abs ~ 2.7. We have thus
retrieved all normalised QSO spectra in the QSOALS database beyond this lower redshift. Our
sample consists of 3702 quasars with redshifts ranging from 2.70 to 5.413 (Schneider et al.
2005). Figure 3 shows the redshift distribution of the sample. More than 60% of the sources
are below a redshift of z_em = 3.5, where we expect to maximise the search efficiency for
O VI doublet signatures and the cleanliness of the O VI candidate sample. As Figure 3
demonstrates, the majority of the sources have i band magnitudes between 19.5 and 20.5.
As the reliability of an absorption line detection is a function of the local signal-to-noise
ratio, we have determined the average signal-to-noise ratio of all spectra within the region
of interest for us. This average value is usually below the overall signal-to-noise ratio of the
spectrum since the flux levels within the Lyman α forest are much lower than beyond it.
Figure 3 shows the distribution of these average signal-to-noise ratio values with the emission
redshifts of the sample sources. Especially towards higher redshifts, it is apparent that the
increasing density of the forest and the general faintness of the sources lead to a severe
decrease in the signal-to-noise ratio that will hinder unambiguous identifications of absorption
lines. However, below z_em ~ 3.5 there are enough quasars with signal-to-noise ratio above
the conservative limit we used in section 2.1 for the feasibility study: 359 spectra are of
objects with z_em < 3.5 and have a signal-to-noise ratio within the forest of S/N > 5.0.
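
As a quick check of the redshift threshold quoted above, the following minimal sketch computes the lowest absorption redshift at which each member of the O VI doublet reaches the blue limit of 3800 Å quoted for the SDSS spectra. The function name is purely illustrative and not part of any SDSS or QSOALS tool.

```python
OVI_REST = (1031.912, 1037.613)  # rest wavelengths in Å, as quoted in the text
BLUE_LIMIT = 3800.0              # blue edge of the SDSS spectra in Å


def min_absorption_redshift(rest_wavelength, blue_limit=BLUE_LIMIT):
    """Lowest redshift at which a line at rest_wavelength reaches blue_limit."""
    return blue_limit / rest_wavelength - 1.0


z_min = [min_absorption_redshift(w) for w in OVI_REST]
print([round(z, 3) for z in z_min])
```

Both values come out slightly below 2.7, consistent with the lower redshift bound adopted for the sample.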



4. Search strategy and algorithm for the SDSS quasar spectra 

The absorption systems and line lists of the SDSS QSOALS catalogues primarily contain
securely identified features, resting primarily on key absorption lines like the Mg II or C IV
doublets. In many cases, identification of such systems is made possible by the fact that
the absorption occurs redward of the Lyman α forest and thus in relatively clean parts of
the spectra. For our search of signatures of the O VI doublet, however, it is clear that
all of the candidates must be found in the forest. Furthermore, these O VI systems need
not be accompanied by other metal absorption systems like C IV or Mg II that would allow
unambiguous identification outside the Lyman forest (cf. Davé et al. (1998)). Thus, we
cannot expect the automated pipelines used to create the SDSS QSOALS catalogues to
have picked up on the relatively weak features within the Lyman forest in which we are
interested. Therefore, we had to develop our own algorithms to construct a complete list of
O VI absorbers at intervening redshifts. After completing our search with this method, we
performed a cross-check with the QSOALS database to see if the putative O VI candidates
also show transitions of other ions at the same redshift. The results of these checks are
summarised in section 5.1.

Fig. 3. Upper panel: The redshift distribution of our SDSS QSOALS sample of
3702 quasars. Note that more than 60% of the sources are below a redshift of z_em = 3.5,
where we expect to maximise the search efficiency and cleanliness of the O VI candidate
sample. 359 of these have a S/N > 5.0. Middle panel: The i magnitude distribution (taken
from Schneider et al. (2005)) of our SDSS QSOALS sample of 3702 quasars. The majority
of the sources have magnitudes in the range 19.5 < i mag < 20.5. Bottom panel: The average
signal-to-noise ratio per pixel in the Lyman α forest for each source versus the emission
redshift z_em. Due to the decreased flux in the forest, this value for the signal-to-noise ratio
often falls substantially below the nominal lower limit of signal-to-noise ratio = 4.0 (indicated
by the dashed line) for the complete spectrum to enter the SDSS data set, especially towards
higher redshifts for which the sources become fainter, in general, and also the density of the
forest increases substantially.



4.1. The search algorithm for finding O VI doublets 



For a median Doppler broadening parameter b_OVI = 16 km/s, obtained by Simcoe et al.
(2002), it is obvious that nearly all of the O VI lines that we expect to be present in SDSS
spectra will not be resolved, as 1 Å (i.e. the rough pixel size) represents about 70 km/s in
velocity space at 4000 Å, where we expect to have the best search efficiency. Nonetheless,
we implemented a simple pixel by pixel comparison^4 for all spectra, characterising each single
pixel within the wavelength range from the blue SDSS limit of 3800 Å up to the maximum
wavelength allowed for the O VI 1032 Å component, λ_max = (z_em + 1) × 1031.912 Å, by its
normalised flux and thus optical depth. For each pixel, we derive its hypothetical absorption
redshift z_abs as if it were the O VI 1032 Å component. Then we identify the corresponding
pixels in the spectra that would belong to the second O VI 1038 Å component as well as the
Lyman α, β and, if possible, γ lines for that same redshift. Naturally, the overlap between the
redshifted wavelength range of the O VI 1032 Å component pixel and the actual wavelength
range of the pixel containing the corresponding O VI 1038 Å component is not perfect. Thus,
the absorption structure of the O VI 1038 Å transition corresponding to its 1032 Å
counterpart can in principle span two pixels, and hence the ratio of the absorption measured
in such pairs of 1032/1038 pixels may no longer be a reliable quantity for identifying O VI.
The estimates of the line widths above, however, demonstrate that the velocity width of a
line is most likely going to be only a small fraction of the width of a single pixel, and so
this scenario will occur only in a small number of cases. Furthermore, by coincidence, the
wavelength separation of the pixels in SDSS spectra and the wavelength ratio specific to the
O VI doublet create an overlap at the level of 0.94 for all spectra and all pixels. We did,
additionally, check for potential O VI candidates in the O VI pixel pairs that overlap
by only 6%, but the number of candidates retrieved in this way agreed well with just the
expectations from random interlopers. We therefore estimate that the loss of valid detections
due to this complication is minimal.

^4 One resolution element of an SDSS spectrum consists of at least 2-3 pixels, and thus even a narrow
feature like a metal absorption line will get spread out over a range of pixels. Our pixel by pixel approach
here, however, works very well because we select on the equivalent width ratio, which is not affected by this
dispersion. Of course, we will thus only be able to detect strong features. For a more detailed comparison of
the single pixel approach versus a search after rebinning to the resolution element size, see the appendix.
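
The pixel bookkeeping described above can be illustrated with a minimal sketch. It assumes that the spectra are sampled on a grid with a constant step of 10^-4 in log10(wavelength), which is an assumption made here for illustration and is not stated in the text; the H I Lyman rest wavelengths are standard values, and all function names are hypothetical.

```python
import math

OVI_1032 = 1031.912   # Å, rest wavelengths from the text
OVI_1038 = 1037.613
# Standard H I Lyman rest wavelengths in Å (assumed; not quoted in the text)
LYMAN = {"alpha": 1215.670, "beta": 1025.722, "gamma": 972.537}
DLOG10 = 1.0e-4       # assumed constant pixel step in log10(wavelength)


def partner_wavelengths(lam_pixel):
    """Treat a pixel at lam_pixel (Å) as O VI 1032 and locate its partner lines."""
    z_abs = lam_pixel / OVI_1032 - 1.0
    partners = {"OVI_1038": (1.0 + z_abs) * OVI_1038}
    for name, rest in LYMAN.items():
        partners["Ly_" + name] = (1.0 + z_abs) * rest
    return z_abs, partners


def doublet_pixel_offset(dlog10=DLOG10):
    """Separation of the two doublet members in pixels on a log-spaced grid."""
    return math.log10(OVI_1038 / OVI_1032) / dlog10


offset = doublet_pixel_offset()
# Fraction of the nearest pixel covered by the redshifted 1038 Å position
overlap = 1.0 - abs(offset - round(offset))
z_abs, partners = partner_wavelengths(4000.0)
print(f"z_abs = {z_abs:.4f}, offset = {offset:.2f} pixels, overlap = {overlap:.2f}")
```

On such a grid the doublet separation is the same number of pixels at every redshift (just under 24 under the stated assumption), which is why the overlap fraction is constant across all spectra; the value obtained this way is roughly at the level quoted above.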



4.1.1. Criteria for identification of O VI lines and rejection of spurious Lyman series lines

In order to assess whether an absorption feature exhibits the correct ratio of optical
depths while not being able to fully resolve the structure of the absorption lines, we need to
derive a quantitative measure of that ratio from the observed normalised fluxes per pixel.
Assume that a) the complete profile of an O VI transition fits into one resolution element,
and b) that this line is the only absorption line within that element (i.e. no blending with
H I Lyman lines). Then, it is possible to extract the equivalent width of the O VI in that
pixel from the measured normalised flux^5 as

$$ EW_{\lambda 1032\,\text{\AA}} = \Delta\lambda \times \left(1 - f_{\mathrm{pixel},1032\,\text{\AA}}\right) $$

where $EW_{\lambda 1032\,\text{\AA}}$ is the equivalent width of the line in that pixel in the observer's frame,
$\Delta\lambda$ is the pixel size in wavelength units, and $f_{\mathrm{pixel},1032\,\text{\AA}}$ is the normalised flux of the pixel
assumed to contain the O VI 1032 Å transition. From this estimate of the equivalent width
$EW_{\lambda 1032\,\text{\AA}}$, we can in turn derive the expected flux $f_{\mathrm{exp},1038\,\text{\AA}}$ at the corresponding pixel:

$$ f_{\mathrm{exp},1038\,\text{\AA}} = 1 - \frac{1}{r_{EW}} \left(1 - f_{\mathrm{pixel},1032\,\text{\AA}}\right) $$

where $r_{EW} = EW_{\lambda 1032\,\text{\AA}} / EW_{\lambda 1038\,\text{\AA}}$ is the ratio of equivalent widths of the two transitions. This ratio
depends on the strength and velocity width of the absorber. For unsaturated lines it remains
fairly close to 2, and drops to unity for saturated lines. We have modelled the O VI 1032 Å
and 1038 Å transitions with Voigt profiles of various Doppler parameters b, and are thus
able to compute:

$$ r_{EW} = \frac{EW_{\lambda 1032\,\text{\AA}}}{EW_{\lambda 1038\,\text{\AA}}} = r_{EW}\left(EW_{\lambda 1032\,\text{\AA}},\, b\right) $$

as a function of the estimated equivalent width $EW_{\lambda 1032\,\text{\AA}}$. This introduces a model
dependence into our derivation as we need to assume a value for the Doppler parameter b. We find,
however, that even large variations of the Doppler parameter, 5.0 km/s < b < 20.0 km/s, only
affect $r_{EW}$ on the 30% level. Here we take b = 15 km/s as fiducial value for the Doppler
parameter.

^5 We rely upon the flux normalisation via the continuum estimate given in the QSOALS database.
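
To make the relations of this subsection concrete, the sketch below evaluates the single-pixel equivalent width, the expected 1038 Å flux for a given ratio, and a simple model of the equivalent width ratio as a function of absorber strength. It is not the modelling used for the search: it swaps the Voigt profile for a pure Gaussian (Doppler-only) profile, parametrises the absorber by column density rather than by equivalent width, and uses oscillator strengths that are assumed standard atomic-data values not given in the text. All function names are hypothetical.

```python
import math

# Rest wavelengths (Å) from the text; oscillator strengths are assumed
# standard atomic-data values and are not quoted in the paper.
LINES = {"1032": (1031.912, 0.1325), "1038": (1037.613, 0.0658)}
C_KMS = 299792.458


def pixel_equivalent_width(flux_1032, dlam):
    """Observed-frame EW of an unresolved line contained in one pixel of width dlam."""
    return dlam * (1.0 - flux_1032)


def expected_flux_1038(flux_1032, r_ew):
    """Flux expected in the 1038 Å pixel for a ratio r_ew = EW_1032 / EW_1038."""
    return 1.0 - (1.0 - flux_1032) / r_ew


def rest_equivalent_width(column_density, b_kms, line, n_steps=4001):
    """Rest-frame EW (Å) of a Gaussian line; column density in cm^-2, b in km/s."""
    wavelength, f_osc = LINES[line]
    # Central optical depth of a Doppler-broadened line
    tau0 = 1.497e-15 * column_density * f_osc * wavelength / b_kms
    v_max = 10.0 * b_kms
    dv = 2.0 * v_max / (n_steps - 1)
    total = 0.0
    for i in range(n_steps):
        v = -v_max + i * dv
        total += 1.0 - math.exp(-tau0 * math.exp(-(v / b_kms) ** 2))
    # Convert the velocity-space integral to wavelength units
    return wavelength * total * dv / C_KMS


def ew_ratio(column_density, b_kms=15.0):
    """Model ratio EW_1032 / EW_1038 for the fiducial Doppler parameter."""
    return (rest_equivalent_width(column_density, b_kms, "1032")
            / rest_equivalent_width(column_density, b_kms, "1038"))


for log_n in (12, 14, 16):
    print(log_n, round(ew_ratio(10.0 ** log_n), 2))
```

In this simplified model the ratio stays close to 2 for weak absorbers and approaches unity for strongly saturated ones, in line with the behaviour described above.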

The following criteria were used to classify pixels as candidates belonging to an O VI doublet.

1. Both pixels of the O VI doublet candidate need to exhibit normalised fluxes lower
than 1.0. This simply ensures that absorption features, rather than pixels whose noise
pushed them over the flux level of unity, are selected.

2. The expected flux $f_{\mathrm{exp},1038\,\text{\AA}}$ is allowed to deviate from the actually measured flux
$f_{\mathrm{pixel},1038\,\text{\AA}}$ by less than a certain threshold

$$ \Delta f = \left| f_{\mathrm{exp},1038\,\text{\AA}} - f_{\mathrm{pixel},1038\,\text{\AA}} \right| < k_1 \times \sigma $$

where σ is the standard deviation of this flux difference, estimated from the signal-to-
noise ratio of each pixel and the uncertainty on the line width mentioned above. The
effects of varying $k_1$ will be described in the next section.

3. The inferred equivalent width of the stronger O VI 1032 Å component has to be above
a minimum limit

$$ EW_{\lambda 1032\,\text{\AA},\,\text{rest-frame}} > k_2 \times \Delta\lambda $$

This enables us to clear our candidate lists from very weak features, i.e. spurious Lyman
α forest lines with low column densities which we expect to be the main contaminant at
high redshifts. Note that this criterion is usually automatically fulfilled for candidates
in spectra with low signal-to-noise ratio passing criterion 4 mentioned below, and thus
is mostly useful for cases where the signal-to-noise ratio is significantly above the 4.0
threshold for entering the sample.

4. The significance of absorption^6 in each pixel should be above a low threshold

$$ 1.0 - f_{\mathrm{pixel}} > k_3 \times \frac{1}{\text{signal-to-noise ratio(pixel)}} \times f_{\mathrm{pixel}} $$

While this criterion is already implicitly in place for the expected and measured flux
difference and the lower limit to the equivalent width, this formulation puts a stronger
limit on the number of candidates. Requiring a high significance for each pixel allows
us to focus on absorption lines with a high likelihood of being real, rather than noise
artifacts. In contrast to criterion 3, this cut proves most useful in spectra with low
signal-to-noise ratios. Note that this criterion introduces a lower threshold for the
equivalent width that is signal-to-noise ratio-dependent: at a fiducial average signal-
to-noise ratio of 5.0 (8.0) and a significance level of 4.0σ (2.0σ), it corresponds to a
rest-frame equivalent width limit for the SDSS set-up of EW_rest = 0.195 Å (0.06 Å).

5. Since we expect O VI to be accompanied by H I Lyman features, we require

$$ f_{\mathrm{Lyman}\,x} \le f_x(\mathrm{limit}) $$

where the x represents either Lyman α, β or, if applicable, γ. Due to a lack of
information regarding the metallicity and the physical state of the absorbing gas, there
is, a priori, no straightforward way of calculating the H I absorption expected for a
given value of $f_{\mathrm{pixel},1032\,\text{\AA}}$. Therefore, we introduce these hard cuts on the flux for the
associated Lyman series transitions, and resort to this simple, yet efficient method of
checking for accompanying H I absorption.

6. We reject all candidates that show signs of being part of Lyman series absorption, i.e.
we check (by eye) whether the feature in question could be a Lyman line, accompanied
by higher (or lower) series members elsewhere in the spectrum with a similar velocity
structure. This turns out not to be a severe cut: only 4 of altogether 1760 candidates
passing all other tests had to be eliminated by this procedure. Thus, we find it
unnecessary to apply a more rigorous test than this check by eye.

^6 Note that our definition for "significance" relies upon the following definitions: our signal is the strength
of absorption in each pixel, i.e. 1.0 - f_pixel in the normalised units. The ratio of this quantity to the noise in
the same units, i.e. (1/SNR) × f_pixel, is our metric for the significance.
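
The automated criteria 1 to 5 listed above can be summarised in a short sketch for a single pixel pair. It is an illustration only: the function name and argument layout are hypothetical, the default thresholds are the values adopted in section 4.1.2, the error term σ is approximated here purely from the per-pixel noise (leaving out the line-width uncertainty mentioned in criterion 2), and the pixel width entering criterion 3 is read as the observed-frame width. The by-eye check of criterion 6 is not represented.

```python
import math


def passes_criteria(f1032, f1038, snr1032, snr1038, dlam, z_abs, lyman_fluxes,
                    r_ew=2.0, k1=0.3, k2=0.05, k3=4.0,
                    lyman_limits=(0.4, 0.6, 0.8)):
    """Apply criteria 1-5 to one candidate pixel pair.

    lyman_fluxes holds the normalised fluxes at Lyman alpha, beta, gamma;
    use None for a transition that is not covered by the spectrum.
    """
    # Criterion 1: both doublet pixels must lie below the continuum
    if f1032 >= 1.0 or f1038 >= 1.0:
        return False
    # Criterion 2: measured 1038 flux must match the expectation within k1 * sigma
    f_exp = 1.0 - (1.0 - f1032) / r_ew
    # Assumed error model: per-pixel noise f / SNR, added in quadrature
    sigma = math.hypot(f1032 / snr1032 / r_ew, f1038 / snr1038)
    if abs(f_exp - f1038) >= k1 * sigma:
        return False
    # Criterion 3: rest-frame EW of the 1032 component above k2 * pixel width
    ew_rest = dlam * (1.0 - f1032) / (1.0 + z_abs)
    if ew_rest <= k2 * dlam:
        return False
    # Criterion 4: absorption significance in each doublet pixel
    for flux, snr in ((f1032, snr1032), (f1038, snr1038)):
        if 1.0 - flux <= k3 * flux / snr:
            return False
    # Criterion 5: accompanying Lyman absorption (limits treated as inclusive)
    for flux, limit in zip(lyman_fluxes, lyman_limits):
        if flux is not None and flux > limit:
            return False
    return True


good = passes_criteria(0.30, 0.65, 10.0, 10.0, 1.0, 3.0, (0.2, 0.5, None))
weak_lyman = passes_criteria(0.30, 0.65, 10.0, 10.0, 1.0, 3.0, (0.9, 0.5, None))
print(good, weak_lyman)
```

Keeping the thresholds as keyword arguments makes it straightforward to vary one cut at a time while holding the others fixed.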

It is clear that the efficiency of detecting O VI candidates and filtering out spurious H I 
Lyman series absorbers is a complicated function of redshift and signal-to-noise ratio of each 
spectrum. Thus, a more flexible approach, tailored to each object, could result in a higher
overall yield. We have, nevertheless, applied the aforementioned rigorous, yet simple strategy 
to the complete sample regardless of redshift and signal-to-noise ratio of each source. This 
allows for a more straightforward comparison of the results with expectations based upon 
our initial feasibility study, and certainly facilitates quantitative analyses like completeness 
estimates. And it is certainly the conservative approach. 

We note that possible blends of the two O VI 1032/1038 Å transitions with other lines at very
similar rest wavelengths constitute an additional complication. Specifically, the presence
of strong lines of C II might have the 1036 Å line of C II overlapping with O VI 1038
Å, and potentially shift this blend to a lower wavelength. Also, any system that contains
appreciable amounts of molecular hydrogen, H2, will produce a blend with the O VI 1038
Å transition, but leave the 1032 Å component untainted^7. At the redshift range of interest
for us, however, the separation between the C II and O VI 1038 Å line centres amounts to at
least 6 Å (observed frame), too much to seriously affect our search criteria. Hence, we have
not implemented any correction factors due to the presence of these features into our search
algorithms.

Interestingly enough, none of our good candidates appear in systems that are bona-fide
damped Lyman α absorbers (DLAs). While Fox et al. (2007) argue that potentially all
DLAs contain appreciable amounts of O VI absorption, we caution that in their sample of 35
high-resolution spectra of DLAs and sub-DLAs that also cover the O VI doublet, about 60%
of the sightlines show blended features due to the high density of forest lines.

4.1.2. Effects of varying the selection criteria on the number of candidates 

Obviously, the ability to retain most real O VI absorbers in the candidate list, while
filtering out the contaminating spurious H I features mimicking metal lines, depends crucially
on the choice of the parameters k_i and f_x(limit) from our search criteria. We have extensively
tested the effects of different choices for these parameters on the numbers and the quality of
the candidates.

Figure 4 demonstrates how changing one of the cut criteria while keeping all others
constant affects the average number of candidates retrieved from the 380 spectra within the
redshift range 2.9 < z_em < 3.0. These spectra have on average about 300 usable pixels. The
data points in black indicate the average number of candidates per spectrum when applying
the full search strategy, while the points in red are derived by introducing a random shift
δλ(random) rather than the true δλ(z_abs) for the doublet separation, but keeping the other
selection criteria (accompanying Lyman transitions, line significance and equivalent widths).
This procedure effectively measures the frequency of interloper incidences. Note that in all
cases, we retrieve more candidates for the 'real' search than for the randomised case. We
have chosen the following parameters for filtering our dataset:

1. Criterion 2 (k_1): Δf < 0.3 × σ

While at first glance it seems unnecessary to apply such a strict criterion, an inspection
of the candidate lists produced with softer limits revealed that too many spurious
candidates could pass. This is mainly due to the uncertainty in the estimated, expected
EW of the O VI 1038 Å feature when not knowing the Doppler width of the absorbers.
While this cut obviously eliminates some potentially good candidates in slightly noisier
spectra, the gain in clearing the sample of H I absorbers mimicking O VI clearly outweighs
this loss.

2. Criterion 3 (k_2): EW_(λ1032Å, rest-frame) > 0.05 × Δλ

The lower limit on the equivalent width of a line introduced by this cut occurs well
before the turn-over to a much lower retrieval rate (at EW ~ 0.3 Å for the case of 2.9 <
z_abs < 3.0, see figure 4), but still effectively eradicates a large number of spurious, very
low column-density H I absorbers that tend to become ubiquitous at higher redshifts.
We have chosen this low limit mainly to keep the (few) cases of spectra with very high
signal-to-noise (for SDSS standards, i.e. signal-to-noise ratio_forest > 10.0). Note that
the lower EW limit introduced by this cut depends on the wavelength coverage of a
pixel, and is thus slightly redshift dependent.

3. Criterion 4 (k_3): 1.0 - f_pixel > 4.0 × 1/(S/N(pixel)) × f_pixel

The number of candidates is a strongly declining function of the detection significance,
as demonstrated by figure 4. However, above a significance level of ~ 4.0σ the curve
shows signs of flattening, before, finally, at even higher values one simply runs out
of pixels above the required signal-to-noise ratio. Thus we have decided to apply the
above limit, which turns out to be the strongest selector in cutting down the numbers
of candidates.

4. Criterion 5: Flux limit for Lyman lines

• f_Lymanα ≤ 0.4

• f_Lymanβ < 0.6 (if applicable)

• f_Lymanγ < 0.8 (if applicable)

Limiting ourselves only to candidates that are accompanied by potential Lyman
features of such depth maximises the ratio between real O VI absorbers and random
interlopers, as can be seen from figure 4. Requiring even lower flux values throws out
mainly potential candidates in low signal-to-noise ratio spectra, when noise pushes
a Lyman pixel over the threshold, while softening the flux criterion leaves too many
spurious interlopers in the sample.

5. In a final step, all of the candidates that passed the above criteria were checked
for possible higher (and also lower) associated Lyman series lines. Only 4 (out of
1756) candidates had to be eliminated this way, which we take as a sign of having
already implemented rigorous cuts to the sample with the criteria discussed above.

^7 In fact, there are also two H2 components at 1032.191 and 1032.356 Å, but these are much weaker than
the 1038 Å transitions.

Fig. 4. The effects of varying cut criteria on the average number of candidates per spectrum
retrieved from all 380 spectra of the sample within the redshift range 2.9 < z_em < 3.0. For all
panels, the data points in black result from the fully implemented search algorithm for O VI
candidates, whereas data points in red are obtained by searching for candidates with random
redshift separations instead of the fixed one based upon the O VI wavelength ratio, while still
maintaining all the other criteria. The dotted, vertical lines indicate the choices for the cut
criteria that minimise the spurious interloper contribution, while simultaneously maximising
the number of real candidates kept in the sample. Criterion 4 (upper left panel): The ratio
of the numbers of real to spurious candidates increases up to a significance level of ~ 4σ.
Criterion 2 (upper right panel): The average number of candidates retrieved is a strongly
increasing function of the allowed flux deviation of the O VI 1038 Å pixel from the expected
value. The uncertainty in the Doppler width of the absorbers contributes significantly to
the error allowance in the estimate for the expected flux. Criterion 3 (lower left panel):
The lower limit on the rest-frame equivalent width of the O VI 1032 Å component appears
well before the turn-over to a much lower retrieval rate of real candidates (~ 0.3 Å for 2.9
< z_abs < 3.0). Criterion 4 is a stronger selector for the low signal-to-noise ratio spectra, thus
we have chosen this weaker selector mainly to avoid very weak interlopers for candidates
in high signal-to-noise ratio spectra. Criterion 5 (lower right panel): Requiring a low flux
value for the accompanying Lyman transitions effectively removes a large fraction of the
random H I lines. Note that applying even stricter limits will throw out good candidates
in low signal-to-noise ratio spectra when noise can push a Lyman pixel over the threshold
limit.

5. O VI doublet candidate list

Figure 5 shows the results of applying the search algorithm with the cut criteria
mentioned in section 4 to the full dataset of 3702 spectra. For each redshift bin, the black data
points in the lower panel represent the average number of candidates retrieved per spectrum.
This number includes both real candidates and spurious interlopers. The red dotted line is a
linear fit to the number of such random interlopers derived by averaging over 15 runs of the
search algorithm when applying a random redshift offset to the location of the O VI doublet
line (properly at 1038 Å). The redshift evolution of these (a power law in 1 + z_abs) agrees
very well with the expectations from the initial feasibility study (cf. equation 7 and fig. 1).
Note that the number of 'real candidates' plus 'spurious interlopers' for the full search is
always higher than in the random search process, lending further credibility to the robustness
of our search strategy. The average approximate ratio of 'real' to 'random' candidates per
redshift interval is presented in the upper panel of fig. 5. The number of 'real' candidates is
assumed to be the difference between the black data points and the estimate for the interloper
frequency (red dotted line in lower panel of that figure). As expected, the ability to produce a
clean sample diminishes rapidly beyond redshifts of z_abs ~ 4.0.
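
The idea behind the randomised control search can be sketched as follows. This is a deliberately stripped-down illustration, not the search code itself: the pair test keeps only a flux-ratio check with hypothetical tolerances, the true doublet separation of 24 pixels assumes a grid with a constant step of 10^-4 in log10(wavelength), and only the averaging over 15 randomised runs is taken from the text.

```python
import random


def count_candidates(flux, offset, r_ew=2.0, tol=0.05, min_depth=0.2):
    """Count pixel pairs separated by `offset` that mimic an O VI doublet."""
    n = 0
    for i in range(len(flux) - offset):
        f1, f2 = flux[i], flux[i + offset]
        if f1 >= 1.0 or f2 >= 1.0:
            continue  # both pixels must show absorption
        if 1.0 - f1 < min_depth:
            continue  # skip very weak features
        f_exp = 1.0 - (1.0 - f1) / r_ew
        if abs(f_exp - f2) < tol:
            n += 1
    return n


def interloper_estimate(flux, true_offset, n_runs=15, max_offset=60, seed=0):
    """Average candidate count when the doublet separation is randomised."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_runs):
        offset = true_offset
        while offset == true_offset:
            offset = rng.randint(1, max_offset)
        counts.append(count_candidates(flux, offset))
    return sum(counts) / len(counts)


# Toy spectrum: flat continuum with one injected doublet 24 pixels apart
TRUE_OFFSET = 24
flux = [1.0] * 300
flux[100], flux[100 + TRUE_OFFSET] = 0.4, 0.7
n_real = count_candidates(flux, TRUE_OFFSET)
n_random = interloper_estimate(flux, TRUE_OFFSET)
print(n_real, n_random)
```

In a real forest spectrum the randomised count is not zero, and the difference between the two numbers is what serves as the estimate of the number of genuine doublets.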

Altogether, we retrieve 1756 candidates^8 in 855 different AGN spectra (out of a total of
3702) passing all of the above criteria. These candidates still exhibit a wide range of
properties. We examined each case manually, and grouped them into three categories depending
on the quality and likely interpretation.

• High Quality Sample :

The highest quality, most unambiguous candidates for O VI absorbers. All lines are
strong and the O VI doublet plus Ly α and Ly β transitions are not obviously blended.
The putative H I absorption is both stronger than the O VI feature and exhibits the
expected line ratios for Lyman α to Lyman β.^9 145 candidates could be placed in this
group. In 76 cases, the absorption redshift is so close to the QSO's emission redshift
that these absorbers are tentatively classifiable as intrinsic (Δv < 5000 km/s, with Δv
the velocity separation from the QSO emission redshift)^10. The remaining group of 69
good candidates will be the prime sample for possible follow-up studies.

• Medium Quality Sample :

These candidates generally exhibit a secure identification of many of the features in
question, but also at least one characteristic that leaves some doubt, e.g. a blending of
one line, weak H I features in comparison to the O VI or excessively broad O VI lines.
586 (30%) candidates fall into this category.

• Low Quality Sample :

In this class of the lowest quality candidates, we encounter mostly examples with very
low overall signal-to-noise ratio within the forest or features that are probably strong
Lyman α absorbers at redshifts low enough such that we cannot see Lyman β to apply
our exclusion criterion. 1029 of the 1756 candidates (~ 60%) fall into this category,
which is considerably higher than the fraction we expected to be spurious interlopers
from our initial feasibility studies, and thus leads us to the conclusion that even in this
group there are real O VI absorbers.

Figures 7, 8 and 9 highlight examples of each of the categories.

^8 We note that in 232 cases two pixels exactly adjacent to each other were selected. Hence, the total
number listed in table 2 is 1756 + 232/2 = 1866. These 116 absorber candidates are also picked up by the
search after rebinning. Such cases are exactly what is expected when the noise is low enough to allow a line
that will be instrumentally broadened over 2-3 pixels to retain its shape. For details see the appendix.

^9 We allow the Lyman γ line to be blended or even non-existent due to its inaccessibility in the spectra for
absorbers of the lowest redshift. Maintaining a strict non-blending criterion for Ly γ, which is much weaker
than Ly α and β, would have left us with too few good candidates.

Fig. 5. The results of the 'real' and 'random' search algorithms applied to the 3702
QSO sample. This graph can be directly compared to Figure 1 where the expected values
based upon theoretical estimates for the Lyman α forest density and the metallicity of the
IGM are presented. Lower panel: For each redshift bin, the black data points represent
the average number of candidates retrieved per spectrum. This number includes both real
candidates and spurious interlopers. The red dotted line is a linear fit to the number of
such random interlopers derived by averaging over 15 runs of the search algorithm when
applying a random redshift offset to the location of the O VI doublet line (properly at 1038
Å). The resulting redshift evolution of these interlopers agrees very well with the estimates
from the feasibility study in paragraph 1. For details see text. Upper panel: The average
approximate ratio of 'real' to 'random' candidates per redshift interval. The number of 'real'
candidates is assumed to be the difference between the black data points and the estimate
for the interloper frequency (red dotted line in lower panel). Note the decreasing efficiency
to produce a clean sample with increasing redshift. As expected from figure 1, the "window
of opportunity" lies roughly between 0.55 < log(1 + z_abs) < 0.7 (i.e. 2.7 < z_abs < 4.0), when
the ratio real/random rises above unity.

Fig. 6. The rest-frame equivalent width distribution of the 1038 Å component for the 1756
candidates found by our search. Note that we have 'measured' the equivalent widths in the
normalised spectra by simply including only the pixel to the left and right of the candidate
pixel in the calculation, and have made no attempt to take the often complicated structure
of the forest around the central pixel of the candidate into account. Given that one resolution
element should in almost all cases encompass the narrow O VI lines (b parameters of the order
of 16 km/s) this method overestimates the width, but does provide secure upper limits.



5.1. Cross-checking the candidate list with the QSOALS database 

Having constructed the candidate list of O VI absorbers with the search algorithm de- 
scribed above, we performed a cross-check with the QSOALS database in order to determine 
whether there are additional ions seen at the same redshift. Here, we have focused first 
on the 69 candidates for intervening absorbers of the highest quality, but will extend the 
cross-check fully in a forthcoming paper. We have checked for other metal lines accompa- 
nying the O VI candidates within a range of ±800 km/s. For 10 of these candidates the 
QSOALS database lists secure identifications of absorber systems at redshifts very close to 
value determined from the O VI doublet. In all of these cases, the velocity difference between 
the O VI absorber determined by our search algorithm and the listed QSOALS redshift for 
the absorber system is less than 300 km/s. These absorbers show transitions in a variety of 
other ions, most prominently C IV 1548/1550 A, C II 1335 A, Si IV 1394/1402 A, Al II 1671 
A, and, in 2 cases of low enough redshift not to push the lines beyond the SDSS spectral 



^^Note that the v elocit y separation a l one d oes not guarantee the systems to be intervening. See 
Richards et all |l999l ) and iMisawa et al.l (|2007l ) for cases where high velocity absorber can be regarded 



as intrinsic to the AGN 
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Fig. 7. — Example of a high quahty O VI absorber candidate. All lines are strong and not 
obviously blended (except for the Lyman 7 feature, which we require to be present, but allow 
for it to be blended). The putative H I absorption is both stronger than the O VI feature 
and exhibits the expected line ratios for the Lyman series {a to /3). The position of the 
O VI doublet and the associated Lyman a,/3,^ hnes are indicated by the dotted hnes. The 
plate, fiber and Julian date identification for the SDSS spectrum, as well as the emission 
and absorption redshifts together with the i magnitude and the average signal-to-noise ratio 
within the Lyman forest are given at the top of the graph. Note that the local S/N, used 
to compute the significance of a feature, may be different from the average value given here, 
and is always taken from the SDSS pipeline error array estimate. 
Fig. 8. Example of an intermediate quality O VI absorber candidate. The positions of the
O VI doublet and the associated Lyman α, β lines are indicated by the dotted lines. The
plate, fiber and Julian date identification as well as the emission and absorption redshifts
together with the i magnitude and the general signal-to-noise ratio within the Lyman forest
are given at the top of the graph. These candidates generally exhibit a secure identification
of many of the features in question, but also at least one characteristic that leaves some
doubt, e.g. a blending of one line (as in the case here for the O VI 1032 Å component),
weak H I features in comparison to the O VI, or excessively broad O VI lines. Note that the
example here also highlights a case where the absorber redshift is too low to have the Lyman
γ transition in the spectrum.
Fig. 9. Example of a poor quality O VI absorber candidate. The positions of the O VI
doublet and the associated Lyman α, β lines are indicated by the dotted lines. The plate, fiber
and Julian date identification as well as the emission and absorption redshifts together with
the i magnitude and the general signal-to-noise ratio within the Lyman forest are given at
the top of the graph. These candidates generally allow only for very insecure identifications
of many of the features in question, mostly because of the low overall signal-to-noise ratio in
the forest (as in the case presented here), but also because of blending. In a few cases a
strong Lyman α system could not be identified as such by our method because, at its low
redshift, the accompanying higher order Lyman series transitions are not available. About 60%
of the candidates fall into this category.
window, even Al III 1855/1863 Å. Not only does the existence of the other transitions in
these absorbers greatly increase the likelihood that the O VI absorbers are real, but it will
also enable us to study the physical state of the absorbing medium via the full, established
arsenal of astrophysical diagnostics. The in-depth analysis of these absorbers will be the
central theme of forthcoming papers.

While the remaining 59 candidates have no absorption systems listed in the QSOALS
database, there are cases where the database indicates the detection of an absorption feature
at the location expected from our O VI redshift but does not assign an identification, probably
because the transition could not be identified securely. We have checked for the presence
of the following ions: C IV 1548/1550 Å, C II 1335 Å, Si IV 1394/1402 Å, Al II 1671 Å,
and Al III 1855/1863 Å. These are the primary transitions that are expected to be strong
enough and that, owing to their rest wavelengths, can fall beyond the Lyman forest, in the part
of the spectrum where detection is much easier. For 36 of the 59 candidates,
we do not find any securely detected absorption feature in the QSOALS catalogues. In 11
cases one of the transitions is listed, and in 8 cases 2 absorption features are present.
For another two absorbers we found 3 features each, and in the two best cases even 4 of
the 8 transitions in question could be retrieved. Given the low likelihood that these features
are noise artifacts, especially where they lie redward of the Lyman forest, the
additional presence of up to 4 absorption lines lends further credibility to the reality of the
O VI candidate.
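
The cross-check described above amounts to predicting where each of the 8 transitions should fall for a given O VI redshift, and whether that position lies redward of the Lyman forest. The following is a minimal sketch of that bookkeeping, not the procedure actually used for the QSOALS comparison. The rest wavelengths are standard vacuum values rounded to 0.01 Å and are supplied here as an assumption, as are the names `REST_WAVELENGTHS_A` and `expected_lines`.

```python
# Rest-frame vacuum wavelengths in Angstrom (assumed standard values,
# not taken from the paper).
LYA_REST_A = 1215.67
REST_WAVELENGTHS_A = {
    "C IV 1548": 1548.20,
    "C IV 1550": 1550.78,
    "C II 1335": 1334.53,
    "Si IV 1394": 1393.76,
    "Si IV 1402": 1402.77,
    "Al II 1671": 1670.79,
    "Al III 1855": 1854.72,
    "Al III 1863": 1862.79,
}


def expected_lines(z_abs, z_em):
    """Return {name: (observed wavelength in A, is redward of the forest)}.

    The red edge of the Lyman forest is taken to be the position of the
    Lyman alpha emission line of the QSO, 1215.67 * (1 + z_em).
    """
    forest_edge = LYA_REST_A * (1.0 + z_em)
    lines = {}
    for name, lam_rest in REST_WAVELENGTHS_A.items():
        lam_obs = lam_rest * (1.0 + z_abs)
        lines[name] = (lam_obs, lam_obs > forest_edge)
    return lines


if __name__ == "__main__":
    # Redshifts of the last entry in the Table 3 excerpt.
    for name, (lam_obs, redward) in expected_lines(3.4643, 3.5790).items():
        print(f"{name:12s} {lam_obs:8.1f} A  redward of forest: {redward}")
```

For the last entry of the Table 3 excerpt (z_abs = 3.4643, z_em = 3.5790) this places C II 1335 Å within a few tenths of an Ångström of the 5957.5 Å listed there; small offsets are expected because QSOALS fits its own redshift to each system.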

We note that the non-detection of a transition in the QSOALS catalogue cannot be equated
with the non-existence of the feature in the spectrum. In order to enter the QSOALS catalogue,
a feature needs to pass certain threshold criteria, set by the automatic program
detailed in York et al. (2005). Specifically, the requirement of a four-sigma detection of a
feature results in an equivalent width sensitivity which depends on the signal-to-noise ratio
but is at least of order 50-200 mÅ in the absorber rest frame (York et al. 2005). We will
manually recheck each spectrum at the wavelengths where we expect potential
absorption features, and derive upper limits for lines not detected. These can also help with
the interpretation of the physical state of the absorbing gas.
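
To illustrate how a four-sigma requirement translates into an equivalent width limit, the sketch below uses a common rule of thumb for an unresolved line, sigma_EW ≈ sqrt(n_pix) × Δλ / (S/N). This is an assumption made for illustration and is not the QSOALS algorithm. Also assumed: the line is measured over `n_pix = 3` pixels, and each SDSS pixel spans 69 km/s (the spectra are binned at constant log-wavelength spacing). The function name `rest_ew_limit_mA` is hypothetical.

```python
import math

C_KMS = 299_792.458      # speed of light in km/s
SDSS_PIXEL_KMS = 69.0    # SDSS pixel width in velocity units


def rest_ew_limit_mA(lambda_rest_A, z_abs, snr_per_pixel, n_sigma=4.0, n_pix=3):
    """Approximate rest-frame equivalent width limit in milli-Angstrom.

    Rule-of-thumb estimate for an unresolved line spread over n_pix pixels;
    n_sigma=4 mirrors the four-sigma requirement quoted in the text.
    """
    lambda_obs = lambda_rest_A * (1.0 + z_abs)
    pixel_width_A = lambda_obs * SDSS_PIXEL_KMS / C_KMS
    sigma_ew_obs = math.sqrt(n_pix) * pixel_width_A / snr_per_pixel
    # Equivalent widths scale with (1 + z), so divide to get the rest frame.
    return 1000.0 * n_sigma * sigma_ew_obs / (1.0 + z_abs)


if __name__ == "__main__":
    for snr in (10, 20, 40):
        limit = rest_ew_limit_mA(1548.20, 3.0, snr)
        print(f"S/N = {snr:2d}: C IV 1548 limit ~ {limit:.0f} mA")
```

Under these assumptions a spectrum with S/N of about 20 per pixel reaches roughly 120 mÅ for C IV 1548 Å, which falls within the 50-200 mÅ range quoted above; the limit scales inversely with S/N.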

A more difficult task is the assessment of the completeness level of our sample. We have
noticed that, in some cases, the QSOALS database indicates the presence of absorbing systems
with O VI that are not picked up by our search algorithm. We have examined
closely the results of both search algorithms for the 134 QSO sightlines with 3.2 < z_em <
3.23. We find that we retrieve 12 out of 37 grade A, B and C absorption systems (containing
4, 3 or 2 secure line identifications at the same redshift, respectively)^12 with O VI doublet



12 For the exact definitions of these grades see Lundgren et al. (2009).
identification in the QSOALS, whereas none of the 5 grade D systems with O VI is detected
by our algorithm. On the other hand, we obtain 24 O VI candidates in that subsample that
do not appear in the QSOALS database, or only show one of the two doublet components.
What is the reason for the apparently low (~ 30%) success rate of our algorithm compared
to the QSOALS search program? In most of the cases where we do not detect a system, one
or even both of the O VI doublet transitions are blended, and in some cases there was a lack
of accompanying Lyman β absorption. Furthermore, in a significant fraction of the QSOALS
candidates, there is a velocity difference between various metal components greater than the
SDSS pixel separation, as pointed out by York et al. (2006), which leads to an exclusion
of the candidate in our approach. Thus, it is clear that the rigidity of our search criteria
does lead to the loss of a certain fraction of real candidates. In order to meaningfully constrain the
properties of the IGM at high redshift, an estimate of this lost fraction is absolutely crucial.
We postpone a full Monte-Carlo analysis of the search efficiency to the next paper in the
series, which will deal with this issue in depth for the absorbers with the highest likelihood of
being real.

The results of this crosscheck with the QSOALS database are summarised in Table 3. 

6. Summary and conclusions

We have systematically searched for signatures of metal lines in quasar spectra of the 
Sloan Digital Sky Survey (SDSS), focusing on finding intervening absorbers via detection of 
their O VI doublet. In this paper, we have presented our search algorithm, and criteria for 
distinguishing candidates from spurious Lyman alpha forest lines. In addition, we compare 
our findings with simulations of the Lyman alpha forest in order to estimate the detectability 
of O VI doublets over various redshift intervals. We have obtained a sample of 1756 O VI 
doublet candidates in 855 AGN spectra (out of 3702 objects with redshifts in the accessible 
range for O VI detection). This sample is further subdivided into 3 groups according to
the potential for follow-up of these candidates: 145 of the candidates with the cleanest
signatures for O VI doublets with high signal-to-noise and high resolution are promising
candidates for higher resolution spectroscopy in order to better constrain the physical state
of the absorbers. Seventy-six of these, however, could be intrinsic absorbers, as their
velocity separation from the QSO is less than 5000 km/s. This leaves us with 69
strong candidates for intervening O VI absorbers at redshifts beyond z_abs = 2.7.

The efficiency of detecting O VI candidates and filtering out spurious H I Lyman series
absorbers is a complicated function of redshift and signal-to-noise ratio of each spectrum,
Table 1: Overview of searches for intervening O VI absorbers at high redshift.

Redshift range        Number of QSO sightlines   Number of O VI absorbers                   Reference
z_abs ~ 0.9           11                         6 systems                                  Burles & Tytler (1996)
1.46 < z_abs < 1.81   1                          6 systems, 8 additional candidates         Reimers et al. (2006)
z_abs ~ 2.5           7                          ~ 50 systems                               Simcoe et al. (2004)
2.0 < z_em < 2.5      10                         136 (component) candidates in 51 systems   Bergeron & Herbert-Fort (2005)
2.0 < z_abs < 2.36    2                          20 (component) candidates in 12 systems    Carswell et al. (2002)
2.1 < z_abs < 3.1     35 DLA systems             9 (system) candidates                      Fox et al. (2007)
z_abs ~ 2.3           2 lensed QSO pairs         10 components                              Lopez et al. (2007)
2.7 < z_abs < 4.2     3702                       1756 candidates (systems),                 This work.
                                                 145 Category 1 candidates,
                                                 69 of these intervening
and thus more flexible search criteria, tailored to each object, could result in a higher overall
yield than achieved here. We decided, however, to apply rigorous, yet simple cuts to the
complete sample regardless of redshift. This will allow for a more straightforward comparison
of the results with expectations based upon our initial feasibility study, and certainly
facilitates quantitative analyses like completeness estimates, which we will undertake in a
forthcoming paper.

To summarise, our "blind" pixel-by-pixel search for O VI absorbers in the over 3700 SDSS
spectra of redshifts beyond z = 2.7 complements other studies that have targeted possible
O VI features by their association with already known, strong absorbers such as damped
Lyman α systems, Lyman limit systems and even metal-line absorbers (Simcoe et al. 2002, 2004).
Traditionally, these studies have focused on very high signal-to-noise ratio and resolution,
at the price of being limited to relatively few individual sources due to the expensive
nature of the observations. Our sample, residing at z_abs > 2.7, therefore greatly increases
the redshift range of known O VI absorbers at high redshifts. It will allow us to test
expectations based upon photoionisation models like the ones presented in Davé et al. (1998),
thereby possibly constraining the physical conditions of the absorbers and the metagalactic
UV/X-ray background at high redshifts.
7. Appendix A: Candidate list of O VI absorbers and details for best candidates

The following table is an excerpt of the list of all O VI candidates retrieved from the 3702
SDSS QSO spectra in our survey. Listed are the unique identifiers of the SDSS object (plate,
fiber and Julian Date of Observation), as well as the emission redshift of the QSO from the
SDSS database. The overall quality flag for each absorber candidate at the given absorber
redshift, z_abs, has the following meaning (for details see text): 1 = excellent, 2 = mediocre,
3 = extremely poor. The signal-to-noise ratio in the forest is the average S/N measured over the
wavelength region of interest for the O VI search, and is often substantially less than the nominal limit
of S/N (total) > 4.0 for an object to enter the SDSS database. The velocity difference between
the absorber and the QSO is calculated from the given emission line redshift estimate, z_em.
Absorbers at velocity differences less than 5000 km/s (i.e. β < 0.0167) are classified here as
'intrinsic' or 'associated', while all other absorbers are tentatively 'intervening'. The table
is sorted by increasing absorber redshift.
The full table is available in the electronic version.
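
The classification by velocity difference can be sketched as follows. The text does not spell out how β is computed from the two redshifts; as an assumption, the first-order expression β = (z_em - z_abs) / (1 + z_abs) is used here, since it agrees with the β values printed in the Table 2 excerpt to within the last digit. The special-relativistic expression is included only for comparison, and all function names are hypothetical.

```python
C_KMS = 299_792.458  # speed of light in km/s


def beta_first_order(z_em, z_abs):
    """Velocity offset in units of c, first-order form (assumed, see lead-in)."""
    return (z_em - z_abs) / (1.0 + z_abs)


def beta_relativistic(z_em, z_abs):
    """Special-relativistic velocity offset in units of c, for comparison."""
    r = ((1.0 + z_em) / (1.0 + z_abs)) ** 2
    return (r - 1.0) / (r + 1.0)


def classify(z_em, z_abs, v_cut_kms=5000.0):
    """Label an absorber 'associated' below the velocity cut, else 'intervening'."""
    v_kms = beta_first_order(z_em, z_abs) * C_KMS
    return "associated" if v_kms < v_cut_kms else "intervening"


if __name__ == "__main__":
    # (z_abs, z_em) pairs taken from the first and fifth rows of Table 2.
    for z_abs, z_em in [(2.71413, 2.93300), (2.72184, 2.74100)]:
        print(z_abs, z_em,
              round(beta_first_order(z_em, z_abs), 4),
              round(beta_relativistic(z_em, z_abs), 4),
              classify(z_em, z_abs))
```

The cut itself corresponds to 5000 / 299792.458 ≈ 0.0167, the threshold quoted in the text. The relativistic value is slightly smaller than the first-order one at large separations, so the choice of formula only matters for absorbers close to the cut.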






Table 3 contains additional information on the intervening O VI candidates with the
highest likelihood of being real (category 1), and a comparison with the results of the
QSOALS database. Listed are the unique identifiers of the SDSS object (plate, fiber and
Julian Date of Observation), as well as the emission redshift of the QSO from the SDSS
database. The signal-to-noise ratio in the forest is the average S/N measured over the
wavelength region of interest for the O VI search, and is often substantially less than the nominal limit
of S/N (total) > 4.0 for an object to enter the SDSS database. We have also included the
positions of other ionic transitions detected in the QSOALS database at the same redshift
as the O VI doublet. The first ten entries in the table are the candidates where the QSOALS
database lists a secure system. The following 59 sightlines have no securely detected system
at the redshift of our candidates; in certain cases, however, specific single transitions are
detected.

The full table is available in the electronic version. 



8. Appendix B: Single Pixel Search vs. Smoothing to the SDSS Resolution

We have performed the search for O VI absorbers based upon the information in the
single pixels of the original SDSS spectra. The SDSS detector is built in such a way that
spectra are slightly oversampled compared to the resolution element achievable with the
nominal R = 1800 resolution. Hence, the question arises: could we have done better by
smoothing the spectra according to the oversampling before searching, thereby potentially
gaining significantly in S/N? The answer to this question depends to a degree on the goals
set for the sample of candidates. In this Appendix, we compare the pros and
cons of the two possible approaches, and explain why we have chosen to use the Single Pixel
Search Method.

Let us begin by clearly stating our criterion for a successful search algorithm. We want to
obtain a sample of good quality O VI absorber candidates. With such a sample we want
to be able to do two things: (a) select candidates for follow-up at
higher resolution and S/N (note that we have set out to construct a sample at higher redshift
than has ever been tried before), and (b) derive basic statistical quantities such as dn/dz and
density / metallicity estimates for this sample. Hence we need to maximise both the
number of absorbers found and the cleanliness of the sample. It is immediately obvious that
increasing the S/N of the basic search bin (pixel vs. resolution element) could indeed
help to increase the number of absorbers we can hope to find, given the often very
low quality of the spectra, with S/N per pixel of the order of a few. On the other hand,
smoothing over a larger bin size increases the probability of H I interlopers falling within the
bin, distorting the flux measurement in a good candidate or creating new bad candidates by
Table 2: List of O VI absorber candidates in all 3702 SDSS QSO spectra of the survey. The
quality flag has the following meaning: 1 = excellent, 2 = mediocre, 3 = extremely
poor. The complete table can be found in the online material.

Sequence   Plate   Fiber   JD      z_abs     z_em      Quality flag   S/N (forest)   β = Δv/c
1          1284    140     52736   2.71413   2.93300   3              3.1            0.0589
2          935     592     52643   2.71670   3.19500   3              6.5            0.1286
3          935     592     52643   2.71755   3.19500   3              6.5            0.1284
4          659     181     52199   2.71927   2.94200   3              4.9            0.0598
5          611     221     52055   2.72184   2.74100   3              3.4            0.0051
6          659     181     52199   2.72269   2.94200   3              4.9            0.0589
7          961     450     52615   2.73127   3.03400   2              5.1            0.0811
8          1264    74      52707   2.73299   2.97400   3              2.8            0.0645
9          291     612     51928   2.73557   2.80900   3              6.4            0.0196
10         611     221     52055   2.73730   2.74100   2              3.4            0.0009



Table 3: Details for category 1 O VI absorber candidates. Observed wavelengths are in Å; a dash marks a transition that is not listed.

Plate   Fiber   JD      z_em     z_abs    S/N (forest)   λ C IV 1548 Å   λ C IV 1550 Å   λ C II 1335 Å
819     530     52409   3.0310   2.9405   2.1            6102.3          6114.2          5260.0
971     508     52644   3.0568   2.9779   6.0            6160.1          6170.2          5308.6
830     431     52293   3.3880   3.3127   3.3            6678.4          6689.7          -
302     438     51688   3.5500   3.4305   2.3            6866.8          6877.8          -
1165    391     52703   3.5790   3.4643   1.9            6913.3          -               5957.5
mimicking an O VI doublet. Therefore, it is a priori not clear which method is favourable.
In order to decide on quantitative grounds, we have smoothed all 3702 spectra
to the nominal resolution of the SDSS spectrograph by running a Gaussian filter with the
appropriate FWHM over each spectrum. The oversampling factor is about 2.5, and hence
we gain a factor of roughly 1.6 (about sqrt(2.5)) in S/N for each pixel. Then we re-ran exactly the same
search algorithm over these new spectra. The results compared to the original search can be
summarised as follows:

• In the single pixel search, we retrieve 1756 'unique' candidates for absorbers, i.e. after
correcting for double or multiple candidates found within one or two pixels of each other
in the same spectrum. The search in the smoothed spectra yields 5018 such candidates
when applying the same search criteria.

• There is some overlap: 543 candidates appear in both lists, i.e. a fraction of 31% of
the Single-Pixel candidates are found again in the search after smoothing.

• The ratio of interlopers to 'all' candidates (real and interlopers), as estimated by the
random search, rises to a much higher value for the search after smoothing than for
the Single-Pixel search: while we estimate about 780 of the 1756 systems found via
the Single-Pixel search to be random interlopers (44%), the same criteria applied to the results
of the search after smoothing yield a fraction of almost 85% (4230 out of 5018), as
detailed below.

It is this latter point that clearly favours the implementation of the one-pixel search strategy,
given the goal of a candidate list that is as clean as possible under the given noise and
resolution constraints of SDSS. Note that the number of detected candidates and random
interloper estimates depends strongly on the cut criteria introduced in section 4.1.1. The
specific values we have adopted were chosen to achieve an optimal result for the single-pixel
method. Is it possible to change those parameters for the new search in order to achieve a
lower fraction of interlopers? We have tested this by varying the cut values over a wide
range of the parameter space, just as we did for the original one-pixel search as detailed in
section 4.1.2, but could not find any combination that made the interloper fraction
drop below 75%. This is highlighted in Figure 10, which can be compared directly to Figure 4:
it is evident that the number of detections rises by a factor of about 5-8 for all cut criteria,
but the fraction of random interlopers increases even more dramatically across the board
(red lines in the figure). This result is somewhat surprising given that the initial feasibility
analysis showed that by increasing the S/N the ratio of real-to-random candidates should
also increase. A potential complication may be the clustering of (stronger) Lyman forest
features, which could increase the number of interlopers.
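
The fractions quoted in this comparison follow directly from the candidate counts. The short script below only redoes that arithmetic, taking the quoted interloper estimates at face value; the variable names are illustrative.

```python
import math

# Counts quoted in the text.
single_pixel_candidates = 1756
single_pixel_interlopers = 780
smoothed_candidates = 5018
smoothed_interlopers = 4230
overlap = 543
oversampling = 2.5

summary = {
    # S/N gain expected from averaging over the oversampling factor.
    "snr_gain": math.sqrt(oversampling),
    "overlap_fraction": overlap / single_pixel_candidates,
    "interloper_fraction_single": single_pixel_interlopers / single_pixel_candidates,
    "interloper_fraction_smoothed": smoothed_interlopers / smoothed_candidates,
    # Candidates left once the estimated interlopers are subtracted.
    "non_interlopers_single": single_pixel_candidates - single_pixel_interlopers,
    "non_interlopers_smoothed": smoothed_candidates - smoothed_interlopers,
}

if __name__ == "__main__":
    for key, value in summary.items():
        print(f"{key:30s} {value:.3f}")
```

Subtracting the estimated interlopers leaves 976 candidates for the single-pixel search and 788 for the smoothed search, so on these estimates smoothing enlarges the candidate list without enlarging the number of expected real absorbers.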
Fig. 10. Results of the search for O VI candidates in the spectra that have been smoothed
according to the SDSS oversampling. The number of candidates found for each combination of
the cut criteria can be directly compared to those with the Single-Pixel approach (Figure 4,
and dotted lines here). It is evident that the smoothing yields more candidates by a factor
of 5-8 for each combination, but the increase in the fraction of random interlopers (red lines)
is even more dramatic.
The overlap fraction between candidates found in both the single-pixel approach and
the search after smoothing may appear low at first glance. Specifically, of the 1756 candidate
systems from the former method only 543 remain after smoothing (31%), while our estimate
of the fraction of real absorbers vs. spurious interlopers and noise artifacts indicates that 69
percent of the candidates should show a true O VI profile and hence extend over more
than one pixel, i.e. should be suitable for a successful detection at the smoothed resolution. In the following, we
investigate how these two seemingly contradictory numbers can be reconciled.

Binning the spectra to the instrumental resolution of SDSS, i.e. combining the flux
from 2-3 adjacent pixels, certainly reduces the noise; but since the absorption in neighbouring
pixels is typically weaker, it may actually lower the signal-to-noise ratio of the candidate.
A priori, it is difficult to assess how the detectability of lines is affected by this modification
of the search routine. Hence, we have resorted to a Monte-Carlo analysis, creating new
data sets of mock absorbers in resolution space based upon the candidates' specifications
obtained in the single pixel search. These new sets of candidates were scrutinised by the
same algorithm we used to search for absorbers in the real, smoothed spectra.

The recipe for creating one such new data set is as follows: we begin by extracting
for each candidate the relevant flux and noise value at each pixel in question for the search
criteria (i.e. O VI 1032 & 1038, Ly α, β and γ (when possible)). In order to estimate the
expected flux values in adjacent pixels, we simply create a Voigt profile which needs to
reproduce the flux in the central pixel and has an appropriate Doppler parameter (fixed
to b ~ 14 km/s for the O VI, and b ~ 35 km/s for H I). To these expected fluxes in the 2
adjacent pixels, we add noise at the level found for that central pixel. Then we combine
the information of the new set of pixels according to the method used for the smoothing of
the data, i.e. adding and averaging the fluxes while adding the noise in quadrature only.
Hence, we derive for each candidate of the one pixel search a new set of flux and noise
values in the relevant transitions. In this way, we construct a new sample of artificially
created potential resolution candidates, based upon the assumption that the one-pixel flux
values found in our search represent the depression of flux in the centre of the line. The
new sample is then subjected to the same search algorithm we used for the real rebinned
spectra, and the fraction of candidates passing the search criteria is noted. We repeat this
procedure for 10,000 versions of such mock lists, in order to estimate the expectation value
and its scatter for the number of candidates remaining. From this simple procedure, we
derive an average "retention rate" of 38 ± 3 percent, only slightly higher than the value
for the overlap percentage in the real data sets. It seems plausible that, allowing for more
complicated scenarios (line centres not exactly at the central pixel, correlated noise, a range of
Doppler parameters for the underlying profiles, etc.), the fraction could be pushed even lower,
and hence we conclude that finding a low overlap percentage is not a sign of our search
algorithms being unreliable.
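
The recipe above can be sketched in a few lines for a single transition. This is a simplified illustration under stated assumptions, not the code used for the analysis: a Gaussian profile stands in for the Voigt profile, the profile is broadened by a Gaussian line-spread function with FWHM = c/R for R = 1800, the pixel width is taken as 69 km/s, the three pixels are averaged with equal weight, and a single hypothetical significance cut (`min_significance`) replaces the full set of search criteria. All function and parameter names are invented for this sketch.

```python
import numpy as np

C_KMS = 299_792.458
PIXEL_KMS = 69.0      # assumed SDSS pixel width in velocity units
RESOLUTION = 1800.0   # nominal SDSS resolving power


def neighbour_depth_ratio(b_kms, pixel_kms=PIXEL_KMS, resolution=RESOLUTION):
    """Line depth one pixel off-centre relative to the central depth.

    Assumes a Gaussian line of Doppler parameter b_kms convolved with a
    Gaussian instrumental profile (a stand-in for the Voigt profile).
    """
    b_inst = (C_KMS / resolution) / (2.0 * np.sqrt(np.log(2.0)))
    b_total_sq = b_kms ** 2 + b_inst ** 2
    return float(np.exp(-pixel_kms ** 2 / b_total_sq))


def retention_rate(central_flux, central_noise, b_kms,
                   min_significance=4.0, n_trials=1000, seed=0):
    """Mean and scatter of the fraction of mock candidates that survive.

    central_flux, central_noise: normalised flux and 1-sigma noise in the
    central pixel of each single-pixel candidate.
    """
    rng = np.random.default_rng(seed)
    flux = np.asarray(central_flux, dtype=float)
    noise = np.asarray(central_noise, dtype=float)

    # Expected flux in the two neighbouring pixels from the line profile.
    wing_flux = 1.0 - (1.0 - flux) * neighbour_depth_ratio(b_kms)

    # Add noise at the level of the central pixel to both neighbours.
    scatter = rng.normal(size=(n_trials, flux.size, 2)) * noise[None, :, None]
    wings = wing_flux[None, :, None] + scatter

    # Average the three fluxes; combine the three noise terms in quadrature.
    mean_flux = (flux[None, :] + wings.sum(axis=2)) / 3.0
    mean_noise = np.sqrt(3.0) * noise / 3.0

    kept = (1.0 - mean_flux) / mean_noise[None, :] >= min_significance
    per_trial = kept.mean(axis=1)
    return float(per_trial.mean()), float(per_trial.std())


if __name__ == "__main__":
    mean, std = retention_rate([0.2, 0.6, 0.8], [0.05, 0.08, 0.1], b_kms=14.0)
    print(f"retention rate: {mean:.2f} +/- {std:.2f}")
```

Each trial corresponds to one mock list; the mean and standard deviation over trials play the role of the expectation value and its scatter. The numbers produced by this sketch depend entirely on the assumed cut and inputs and are not expected to reproduce the 38 ± 3 percent quoted above.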



